What makes us different? White Paper (Brief)

FastBIT™ Patching Process



A Deeper Look into Our Technology

The FastBIT™ patching process is the core technology behind our revolutionary program. The patching process compares two versions of the same file and extracts the differences between them. The extracted differences are saved into a new file and compressed into what is known as a patch. The patch file is often 85% to 99.9% smaller than the file from which it was extracted.
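The compare, extract, and compress steps described above can be illustrated with a short Python sketch. It is not the FastBIT algorithm, which is not published here; it substitutes the standard library's `difflib.SequenceMatcher` for the comparison step and `zlib` for compression. The function names `make_patch` and `apply_patch` and the one-byte record tags are assumptions made for illustration only.

```python
import difflib
import random
import struct
import zlib


def make_patch(old: bytes, new: bytes) -> bytes:
    """Build a compressed patch that turns `old` into `new` (illustrative format)."""
    # autojunk=False stops the matcher from ignoring frequently occurring byte values.
    matcher = difflib.SequenceMatcher(None, old, new, autojunk=False)
    parts = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "equal":
            # Unchanged run: store only its position and length in the old file.
            parts.append(b"C" + struct.pack(">II", i1, i2 - i1))
        elif j2 > j1:
            # Inserted or replaced run: store the new bytes themselves.
            parts.append(b"D" + struct.pack(">I", j2 - j1) + new[j1:j2])
        # A pure deletion needs no record: those old bytes are simply never copied.
    return zlib.compress(b"".join(parts))


def apply_patch(old: bytes, patch: bytes) -> bytes:
    """Rebuild the new version from the old version plus the patch."""
    raw = zlib.decompress(patch)
    out = bytearray()
    pos = 0
    while pos < len(raw):
        kind = raw[pos:pos + 1]
        pos += 1
        if kind == b"C":
            start, length = struct.unpack_from(">II", raw, pos)
            pos += 8
            out += old[start:start + length]
        else:
            (length,) = struct.unpack_from(">I", raw, pos)
            pos += 4
            out += raw[pos:pos + length]
            pos += length
    return bytes(out)


# Stand-in "file": 4,096 pseudo-random bytes with a small edit in the middle.
rng = random.Random(1)
old_version = bytes(rng.randrange(256) for _ in range(4096))
new_version = old_version[:1000] + b"edited" + old_version[1000:]

patch = make_patch(old_version, new_version)
assert apply_patch(old_version, patch) == new_version
print(f"changed file: {len(new_version)} bytes, patch: {len(patch)} bytes")
```

`SequenceMatcher` is convenient for a demonstration but slows down sharply on large inputs, so a production patcher would need a purpose-built comparison routine.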

[Diagram of the FastBIT Patching Process]

Applying this technology to the backup process will reduce the use of communication lines, backup tapes, and physical storage, all leading to one thing...savings! Reduce your costs without sacrificing the integrity of your backups. Here's how our FastBIT patching technology is applied to your backup process.

When our backup program encounters a file for the first time, it compresses the file and sends it securely to the backup server. Once a file has been compressed and sent to the server, the ENTIRE file will never again be sent to the server. All future changes to your files will result in only the changes within the files being sent to the server. When the changes are received by the server, they are applied to your backup files creating a complete up-to-date copy of your file system. As an optional service, your daily FastBIT Patch Backup files can be stored separately on the server allowing for the flexibility of restoring any file(s) from your backup data from any point in time.
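The flow described above (one full compressed upload, then changes only, with optional retention of each day's patch) might be modeled along the following lines. This is a hedged sketch, not the product's protocol: the classes `BackupServer` and `BackupClient` are hypothetical, the client is assumed to keep the previous version of each file so it has something to compare against, and a deliberately simple prefix/suffix comparison stands in for the real patching step.

```python
import zlib


def delta(old: bytes, new: bytes):
    """Stand-in diff: keep the shared prefix and suffix, send only the middle."""
    limit = min(len(old), len(new))
    prefix = 0
    while prefix < limit and old[prefix] == new[prefix]:
        prefix += 1
    suffix = 0
    while suffix < limit - prefix and old[-1 - suffix] == new[-1 - suffix]:
        suffix += 1
    return prefix, suffix, new[prefix:len(new) - suffix]


def patch(old: bytes, change) -> bytes:
    """Apply a (prefix, suffix, middle) change produced by delta()."""
    prefix, suffix, middle = change
    return old[:prefix] + middle + old[len(old) - suffix:]


class BackupServer:
    def __init__(self):
        self.base = {}      # path -> first full copy received
        self.current = {}   # path -> up-to-date copy
        self.history = {}   # path -> patches in order (the optional service)

    def receive_full(self, path, compressed):
        data = zlib.decompress(compressed)
        self.base[path] = data
        self.current[path] = data
        self.history[path] = []

    def receive_patch(self, path, change):
        self.current[path] = patch(self.current[path], change)
        self.history[path].append(change)

    def restore(self, path, version):
        """Rebuild the file as it was after `version` patches (0 = first backup)."""
        data = self.base[path]
        for change in self.history[path][:version]:
            data = patch(data, change)
        return data


class BackupClient:
    def __init__(self, server):
        self.server = server
        self.last_sent = {}  # assumption: the client keeps the prior version locally

    def backup(self, path, data) -> int:
        """Back up one file and return the number of payload bytes sent."""
        if path not in self.last_sent:
            payload = zlib.compress(data)   # first encounter: send the whole file
            self.server.receive_full(path, payload)
            sent = len(payload)
        else:
            change = delta(self.last_sent[path], data)  # afterwards: changes only
            self.server.receive_patch(path, change)
            sent = len(change[2])
        self.last_sent[path] = data
        return sent
```

Calling `backup()` twice on the same path sends the compressed file the first time and only the changed bytes the second time, while `restore(path, 0)` still returns the original version.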

If you're wondering about the reliability of the patching technology, there's no need to wonder anymore. This technology is not new. It has been the only choice for IBM, Microsoft, Novell, and many other hardware and software companies needing to update commercially distributed software. NovaStor is the first to critically integrate this existing technology into a high performance backup application.

The FastBIT patching process manipulates files at the binary level. This means it can process any file type without error. Different types of files will yield different FastBIT patch sizes based on the binary organization of the file. We provide some FastBIT patch statistics to illustrate this point further.

File Type | Original File Size (bytes) | Change Description | Changed File Size (bytes) | Patch Size (bytes) | % Reduction (patch vs. changed file)
--- | --- | --- | --- | --- | ---
Windows BMP (8-bit) | 307,514 | Added text to center of image | 308,278 | 2,615 | 99.15%
Microsoft Word v7.0 | 431,616 | Copied text from middle and pasted at end | 448,512 | 13,598 | 96.97%
Microsoft Excel v7.0 | 108,544 | Inserted new worksheet; created basic calculation and added 3D Bar-Graph | 114,176 | 5,915 | 94.82%
Microsoft Access v2.0 | 1,802,240 | Added 3 new records | 1,802,240 | 5,700 | 99.68%
Intuit QuickBooks | 1,265,664 | Paid 4 bills and added 2 invoices | 1,301,504 | 8,074 | 99.38%
Photoshop File | 515,473 | Added new layer and added text to new layer | 524,769 | 4,480 | 99.15%
Plain Text File | 37,084 | Added text to beginning, middle and end of file | 39,123 | 1,285 | 96.72%
Total | 4,468,135 | Average Daily Backup | 4,538,602 | 41,667 | 99.08%

FastBIT reduces the average daily backup by OVER 99%
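For readers who want to check the arithmetic, the reduction column is the patch size expressed against the changed file size. The short Python sketch below recomputes it from the byte counts in the statistics above; the variable and function names are illustrative only.

```python
# (changed file size, patch size) in bytes, taken from the statistics above.
rows = {
    "Windows BMP (8-bit)": (308_278, 2_615),
    "Microsoft Word v7.0": (448_512, 13_598),
    "Microsoft Excel v7.0": (114_176, 5_915),
    "Microsoft Access v2.0": (1_802_240, 5_700),
    "Intuit QuickBooks": (1_301_504, 8_074),
    "Photoshop File": (524_769, 4_480),
    "Plain Text File": (39_123, 1_285),
}


def reduction(changed: int, patch: int) -> float:
    """Percentage by which the patch is smaller than the changed file."""
    return 100 * (1 - patch / changed)


for name, (changed, patch) in rows.items():
    print(f"{name:<24}{reduction(changed, patch):6.2f}%")

total_changed = sum(changed for changed, _ in rows.values())
total_patch = sum(patch for _, patch in rows.values())
print(f"{'Total':<24}{reduction(total_changed, total_patch):6.2f}%")
```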


A New Breed of Backup System and the Underlying Technology

Introduction

Today, IS professionals face a dilemma that, lacking a solution, has far-reaching implications for the future: how can an effective backup policy be implemented on the current network infrastructure? The solution becomes more elusive given the trend towards "maximum" computing. Demand for feature-rich applications sporting graphics and multimedia effects has significantly increased data file sizes, and that trend continues upward.

Therefore, an efficient backup strategy is rapidly becoming a top priority for IS. As "backup windows" decrease, file size increases, and file locations become decentralized, IS professionals are looking to create a more easily managed backup environment. To that end, focus has turned toward establishing a centralized backup methodology utilizing the Client/Server model found in many of today's most advanced applications.

Current Technology

To meet the demand for a centralized "Server Centric" backup policy, software developers have created some noteworthy applications. Many utilize "clients" or "agents" residing on workstations that permit the server access to remote workstation files during an enterprise backup session.

However, there is one major underlying factor that diminishes the effectiveness of most of these programs: full-file incremental backups. More specifically, any minor change in a file requires the backup of the entire contents of that file. The ramifications are obvious as data files grow while the network bandwidth available to back them up shrinks.

This gives rise to an important observation: while workers may be creating larger files, daily changes to those files are, on average, small. This leads to the obvious conclusion that if there were a procedure in place to permit the extraction and backup of only those portions of a file that change day to day, backup size and time would dramatically decrease.

The Next Step

While not new, the concept of backing up only binary changes to data has, nonetheless, eluded backup software developers. However, if one delves into the actual mechanism of such a function, one quickly realizes that the process is not as straightforward as one’s first observation might suggest. In fact, it is far more complex. It is this complexity that has relegated the concept to being just that – a concept. Until now.

Recently, programs that perform "Televaulting", or off-site backups, have been receiving significant press coverage. Utilizing standard telecommunications devices such as modems and ISDN adapters, these backup applications collect and back up changed data to a remote site. However, if one scrutinizes this process carefully, one quickly realizes that, using current technology, such an application would have little use in a large-scale business environment.

To increase acceptance of televaulting as a viable backup solution for the corporate world, developers have invested a significant amount of time and expense into improving the underlying technology. Two significant innovations have come from these efforts, both of which permit discrete data changes to be backed up instead of the entire file.

Block Technology

The first innovation to come from the development of the latest backup software is referred to as "block technology". In one form or another, block technology has been around for some time and was originally developed as a method for mirroring data from one hard drive to another.

In essence, the block technology process evaluates changed data by breaking a file down into discrete blocks of information. These blocks are typically between 4 and 32 kilobytes in size. Through the use of a cyclic redundancy check (CRC), block technology compares each block of a modified file with the corresponding block in the previous version of that file. When the process detects a difference, it extracts a copy of that discrete block, not the entire file. In practice, changes in files will usually result in a number of blocks being copied. However, the cumulative size of these blocks will be less than that of the original file. This has the effect of reducing the total backup size and time.
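A minimal Python sketch of the block comparison just described follows. It assumes a 16-kilobyte block (within the 4 to 32 kilobyte range mentioned above) and uses `zlib.crc32` as the CRC; the function names are illustrative, not taken from any particular product.

```python
import zlib

BLOCK_SIZE = 16 * 1024  # assumed block size, within the 4-32 KB range


def block_crcs(data: bytes, block_size: int = BLOCK_SIZE) -> list:
    """CRC-32 of each fixed-size block of `data`."""
    return [
        zlib.crc32(data[offset:offset + block_size])
        for offset in range(0, len(data), block_size)
    ]


def changed_blocks(old: bytes, new: bytes, block_size: int = BLOCK_SIZE) -> dict:
    """Map block index -> block contents for every block of `new` whose CRC
    differs from the corresponding block of `old` (or has no counterpart)."""
    old_crcs = block_crcs(old, block_size)
    changed = {}
    for index, offset in enumerate(range(0, len(new), block_size)):
        block = new[offset:offset + block_size]
        if index >= len(old_crcs) or zlib.crc32(block) != old_crcs[index]:
            changed[index] = block  # the whole block is extracted, not just the edit
    return changed


# A 64 KB stand-in file with a single byte altered inside the second block.
old_file = bytes(64 * 1024)
new_file = old_file[:20_000] + b"\x01" + old_file[20_001:]
extracted = changed_blocks(old_file, new_file)
print(sorted(extracted), sum(len(block) for block in extracted.values()))
```

Because a CRC is a checksum rather than a proof of equality, two different blocks can in rare cases share a CRC value; a real implementation would have to decide how to handle that.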

However, observing block technology in action reveals that it produces larger backups than one would expect. This is due, in part, to the use of a fixed block size. If only 1 kilobyte of data has changed, but the block size is 16 kilobytes, the entire 16-kilobyte block is extracted. Combine this with similar changes to other blocks and one will observe that the size of the extracted data can be significantly greater than the actual size of the changed data.
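The overhead can be worked through with a few lines of Python. The three edits below are invented for illustration, and the sketch assumes they are made in place (the file does not grow or shrink) and that every touched block is a full 16 kilobytes.

```python
def blocks_touched(changes, block_size):
    """Indexes of the fixed-size blocks overlapped by (offset, length) edits."""
    touched = set()
    for offset, length in changes:
        first = offset // block_size
        last = (offset + length - 1) // block_size
        touched.update(range(first, last + 1))
    return touched


block_size = 16 * 1024

# Hypothetical in-place edits: (byte offset, number of bytes changed).
changes = [(5_000, 1_024), (40_000, 200), (100_000, 50)]

changed_bytes = sum(length for _, length in changes)
extracted_bytes = len(blocks_touched(changes, block_size)) * block_size

print(f"bytes actually changed: {changed_bytes}")
print(f"bytes extracted as whole blocks: {extracted_bytes}")
print(f"overhead factor: {extracted_bytes / changed_bytes:.1f}x")
```

In this example three small edits land in three different blocks, so three full blocks are extracted even though only a little over 1 kilobyte of data changed.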

FastBIT™ Binary Patching

The second backup technology making headlines today is "FastBIT™ binary patching". Originally developed over 8 years ago as a method for upgrading software, binary patching has received widespread acceptance by many of the world's largest hardware and software manufacturers including IBM, Compaq, and Microsoft.

To cut costs and decrease the time to market, manufacturers distribute their updates as tiny files or "patches" containing only the binary difference between the old and new versions of their software. Once received by the client, these patches are applied to, or merged into, the existing file, instantly upgrading it to the latest release. An obvious advantage is that the size of the upgrade is reduced significantly. This permits clients to use modem dial-up connections to obtain software updates instead of the more traditional forms of distribution such as floppy disk or CD-ROM.

Although FastBIT binary patching may sound similar to block technology, it differs in one significant aspect: FastBIT binary patching does not evaluate a file as a collection of discrete blocks but rather as a continuous string of binary data.

Utilizing a complex algorithm and special memory management, FastBIT binary patching is capable of comparing files and extracting "patches" of binary data that represent only the specific changes to those files. Simply put, if only 1 kilobyte of data has actually changed in the file, then only about a 1-kilobyte patch is extracted for backup, eliminating the overhead imposed by the block technology methodology.

In a real world application backup scenario, each discrete patch is combined with those from other files into a single archive and then compressed. This compressed archive is transferred to a backup server, and the patches are extracted and saved either discretely or are applied to the server's copy of the original data file.
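The bundling step described above can be sketched in Python with the standard `zipfile` module, which combines and compresses in one pass. The archive format actually used by the product is not stated, so the ZIP container, the `.patch` suffix, and the function names here are assumptions for illustration. Patches are treated as opaque byte strings.

```python
import io
import zipfile

SUFFIX = ".patch"  # hypothetical naming convention for archive members


def pack_patches(patches: dict) -> bytes:
    """Client side: combine per-file patches into one compressed archive."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", compression=zipfile.ZIP_DEFLATED) as archive:
        for path, patch in patches.items():
            archive.writestr(path + SUFFIX, patch)
    return buffer.getvalue()


def unpack_patches(blob: bytes) -> dict:
    """Server side: extract each patch so it can be stored or applied."""
    with zipfile.ZipFile(io.BytesIO(blob)) as archive:
        return {
            name[:-len(SUFFIX)]: archive.read(name)
            for name in archive.namelist()
        }


# Placeholder patch contents standing in for real binary patches.
daily_patches = {
    "docs/report.doc": b"\x00" * 300 + b"new paragraph",
    "books/ledger.qbw": b"\x00" * 500 + b"two invoices",
}

archive_bytes = pack_patches(daily_patches)
assert unpack_patches(archive_bytes) == daily_patches
print(f"{len(daily_patches)} patches in a {len(archive_bytes)}-byte archive")
```

On the server, each extracted patch could either be written to disk as received, which supports point-in-time restores, or applied immediately to the stored copy of the file.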

Observing the binary patching process, one can quickly see a significant decrease in backup size over that of the block technology system. This is clearly demonstrated in Table 1, which outlines the results of a carefully designed and executed test.

Empirical Comparison

To better understand the effectiveness of block technology versus FastBIT binary patching, a simulated workflow model was created that closely approximates that of the average business-computing environment. Table 1 outlines the results of applying this workflow model to a group of 5 file sets that one might find in the average corporation.

While it is obvious that each technology produced backup files substantially smaller than the original, it is evident that FastBIT binary patching significantly outperformed block technology in every instance. Moreover, while the results may seem inconsequential at this level, when multiplying these figures by the large number of users an average corporation might have, the difference becomes staggering.