SQL Theory: 15 Steps to Convert Access Data to SQL Server
15 Steps to Convert Access Data
to SQL Server
Sep 30, 1999 | SQL Server Pro
I work for a small consulting company that has provided clients
with Microsoft Access desktop solutions since the early 1990s. Three clients
had recently outgrown their multiuser Access applications and were looking for
help with upsizing to SQL Server. Although my consulting company had offered
only desktop solutions up to this point, it decided to enter the client-server
market. In preparation, several other consultants and I decided to upsize a
mid-sized Access system to SQL Server.
For our
training project, we chose to upsize an Access system that we had set up for a
pharmaceutical company. This system, which tracked the required courses and
physical exams that employees took, was a typical Access database for us in
terms of size, complexity, and design. The system had an Access interface and a
Web browser interface. Although this database wasn’t large, the IT staff
members at this site wanted to upsize it so that they didn't have to disconnect
the data from the Web server every time they needed to repair and compact the
data.
When we
had initially set up this client's Access system, we had separated the data
(i.e., tables) and the code (i.e., queries, forms, reports, macros, and
modules) into two .mdb files. The data .mdb file contained about 7MB of data in
36 tables. The code .mdb file contained about 5MB of code. Because we always
normalize to at least the third normal form and enforce referential integrity,
we had created 39 enforced relationships. All the table relationships had
cascading deletes. (Cascading updates weren't necessary.)
In every
table, we had used an AutoNumber field as a bookmark primary key to eliminate
multiple-field primary keys, which can greatly slow insert and update
performance.Figure 1 illustrates how this approach works.
As Figure 1 shows, the intersection table between tblTrainee and tblCourse is
tblTraineeCourse. This table, which records the courses that trainees take, has
three fields:
·
bmk_TraineeCourse, the AutoNumber
primary key
·
fgn_Trainee, a long integer foreign
key to the tblTrainee table bookmark
·
fgn_Course, a long integer foreign
key to the tblCourse table bookmark
Of
similar design is tblTraineeExamReq, an intersection table that uses the
AutoNumber primary key of bmk_TraineeExamReq to record the physical exams that
trainees take.
In the
conversion to SQL Server, we needed to preserve the value of all the AutoNumber
fields. To find out how to preserve Access data, we conducted research. We
found a lot of information about upsizing Access applications. However, most of
the information discussed how to optimize the application after you upsize the
data. Little information existed on how to use the Upsizing Wizard, which is
part of the Microsoft SQL Server Upsizing Tools for Access 97, to load the
Access data into SQL Server. Therefore, we thought that using the Upsizing
Wizard would be easy. We were wrong. We worked for almost 24 hours before we
finally had the 7MB of data and constraints in place.
Instead
of discussing our many failures, I'll discuss data upsizing procedure that we
now use regularly to upsize small and mid-sized Access databases. (If you have
a large Access database, you need a bulk copy program—bcp.) Our data upsizing
procedure has 15 steps.
Step 1: Create the Database and Log
Devices
We had
heard that if you create a database device twice the size of the Access .mdb
file, you'll have plenty of room for the conversion. However, that estimate is
incorrect if you have Memo fields. Memo fields, which upsize to Text columns,
take up an enormous amount of space because the Upsizing Wizard creates a
Timestamp column for each field. For example, in our case, one particularly
memo-heavy application went from being a 3MB file in Access to a 20MB file in
SQL Server. Thus, we created database and log devices that were at least five
times bigger than the.mdb file.
Step 2: Create the Database
You must
create the database and an ODBC data source for SQL Server’s master database.
The ODBC data source obtains storage information when you ask the Upsizing
Wizard to create the database.
Step 3: Gather Data on Table
Relationships
You must
use the Access Database Documenter to gather information about the table
relationships in the database. If you didn’t install the Documenter, you need
to. You can download the Documenter from the Microsoft Office setup disk.
To gather
the table relationship data, choose Tools, Analyze, Documenter from your Access
.mdb file database window menu. After selecting the Current Database tab, choose
Relationships and click OK to print the details of all the relationships in the
database. If the Documenter isn’t working properly, run it from the
Relationship window. Review the printout to make sure the referential integrity
and cascades are set correctly. Keep this printout handy because it can help
you determine the table dependency order in step 6.
Step 4: Make the .mdb File Compliant
The names
in the .mdb file must comply with the rules for SQL Server 6.5 identifiers. If
the table names or field names aren't valid SQL Server identifiers, the
Upsizing Wizard forces the table to become SQL Server compliant. If a name is
longer than 30 characters, the wizard truncates the name to 30 characters. If a
name contains a space or an illegal character, the wizard changes the space or
character to an underscore (_). (For a list of illegal characters, see SQL Server Books Online—BOL.) The wizard then
creates an Aliasing query, which is an Access query that references the new SQL
Server table. The Aliasing query takes the name of the original table and
translates all the new SQL Server-compliant field names to their original
names.
For
example, suppose you have an Access (JET) table named Employee Job Title, which has the field DescriptionOfJobResponsibilities.
Because SQL Server doesn't recognize names with spaces or names containing more
than 30 characters, the Upsizing Wizard creates the table as Employee_Job_Title and the field as DescriptionOfJobResponsibiliti. The
wizard then attaches the new SQL Server table Employee_Job_Title to the Access table Employee_Job_Title_remote. Finally, the
wizard creates an Aliasing query calledEmployee Job Title that references
Employee_Job_Title_remote and renames the field DescriptionOfJobResponsibiliti
to DescriptionOfJobResponsibilities.
Although the application will work, it references an Aliasing query rather than
a linked table. As a result, the application will run much slower that if it
referenced a linked table.
You can
use your favorite search-and-replace utility to make the names in the .mdb file
compliant with the rules for SQL Server 6.5 identifiers. Our favorite utility is Find and
Replace, a search-and-replace utility for Access. Don't try to
manually find and replace all spaces and illegal characters. If you miss just
one space or character, you’ll have to rerun the Upsizing Wizard.
Step 5: Remove All Recursive
Relationships
Recursive
relationships are relationships between two fields in the same table that you
enforce. An example is an Employee table with a foreign key Supervisor that
references the primary key of that Employee table. If you try to upsize this
table, the Upsizing Wizard will have no problems creating the table structure.
However, the wizard will encounter numerous difficulties loading the data
because it has to load all the Supervisors before it loads the other
Employees—and most Access systems don't organize data that neatly. Therefore,
you must remove all recursive relationships and write triggers to enforce
recursive relationships. Listing 1 contains example SQL trigger code, including
code that enforces recursive relationships. Another way to enforce recursive
relationships is to remove all recursive relationships and then, after the
conversion, create foreign keys that reference tables' primary keys.
Step 6: Determine the Table
Dependency Order
When you
load data, you must first load the tables with no foreign keys (i.e.,
independent or first-level tables), followed by the tables that depend only on
first-level tables (i.e., second-level tables). Then you load the tables that
depend only on first- or second-level tables (i.e., third-level tables), and so
on. (First-level tables are parent tables to second-level tables, which are
child tables to first-level tables. This analogy continues down through the
table levels.) Establishing the table dependency order is important because if
you load data out of order, the load fails.
Step 7: Check All Default Values
Checking
all default values is important, especially if you have Number fields with a
default value of 0 in tables that you don’t want to allow 0 values. For
example, if you reference the Identity field (which is similar to the Access
AutoNumber field) of a SQL Server parent table as an optional foreign key in a
child table, the default is 0, even if you allow only nulls and those values in
the parent table. Here's why this problem occurs. Access displays the defaults
before you save a record. Thus, you could see a 0 in the Access interface and
remove it to set the field to null. However, when you attach SQL Server tables,
Access no longer displays the defaults. Thus, it looks like you're entering a
null, but when you save the record, SQL Server tries to insert the 0 value and
the record update fails.
Step 8: Move the Data Tables into the
Code File
You need
to perform this step only if your data and code are in separate databases. (In
our training project, we performed this step because we had separated the data
and the code into two .mdb files.) When the Upsizing Wizard runs, it attaches
all the SQL Server tables to the database from which you're running the wizard.
If you leave the tables in a separate database and run the wizard from that
tables database, when you link your code database to the new SQL Server tables,
the wizard names all the newly linked SQL Server tables dbo_TableName. You must then manually rename
each table. You can avoid this hassle by importing all the tables from the data
.mdb file into the code .mdb file before you run the Upsizing Wizard. Be sure
to include the relationships, structure, and data.
Step 9: Make a Backup Copy of Your
Files
Backing
up your files at this point in the data upsizing procedure is crucial. You’ve
probably spent at least 2 hours preparing the files for upsizing. If anything
goes wrong and you must rerun the Upsizing Wizard, you don't want to have to start
over. Although this step might seem obvious, we were so excited about running
the wizard that we forgot to save our SQL-ready files in our first two upsizing
operations. We weren't too happy when we had to reconstruct all our work.
Step 10: Run the Upsizing Wizard
When you
run the Upsizing Wizard, you should upsize all the tables at once to ensure
that the table relationships stay intact. If Table A has a relationship with
Table B and you upsize Table A but not Table B, the wizard breaks the
relationship between the two tables.
The
Upsizing Wizard lets you specify the table attributes to upsize, the data
options to include, and any database modifications. As Screen 1 shows,
run the Upsizing Wizard with these options:
·
Use Declarative Referential Integrity
(DRI), not triggers, for relationships. If you don't have much experience with
SQL Server, DRI is more predictable and easier to use than triggers. You can
have problems with referential integrity if you choose triggers; the insert and
delete operations might behave unpredictably. However, if you use DRI, you must
write triggers for cascading deletes.
·
Don't let the wizard decide where to
put the timestamps if you don't have a strong preference. The upside to
timestamps is that they track the sequence of changes in your database if you
have the time and expertise to figure out how they work. The downside to
timestamps is twofold. First, timestamps add data storage space to your database.
Second, they make it significantly more difficult to bcp data from Access
tables to SQL Server tables. If you let the wizard place the timestamps, you
must do more advanced bcp formatting so SQL Server knows how to map the fields
from the text file to the database fields. Advanced bcp formatting is a
time-consuming, tedious process.
·
Create the table structure only—don't
upsize any data. Telling the wizard to only create the table structure is
essential if you have a system with referential integrity. If you let the
wizard upsize the data, it might upsize child tables before parent tables,
causing the data load for those tables to fail.
·
Link newly created SQL Server tables
and save the Password information with the linked tables (unless this setup violates
security constraints). Otherwise, when you open the Access file, you'll likely
get a prompt for a password to reestablish your connection with SQL Server.
(Whether you get this prompt depends on the SQL Server security mechanism in
place.)
After you
specify these options, click Next to go to the final wizard screen. This screen
gives you the option of creating a log report. We can’t stress strongly enough
how helpful log reports are in determining what happened during the upsizing
operation. After you select whether you want this report, click Finish to run
the wizard.
Step 11: Check the Log
After the
Upsizing Wizard runs, print the log immediately because the wizard deletes this
log. Check the log for Aliasing queries and other errors. If you get errors,
you must decide whether to work around them or redo the data upsizing
procedure. If you decide on the latter, you must delete your newly created
database, revert to your saved .mdb files, fix the problems on the Access side,
back up your .mdb files again, and rerun the Upsizing Wizard.
The log
might show that your table-level validation rules didn't upsize. The Upsizing
Wizard doesn't upsize these rules, so you must write triggers for them.
Step 12: Back Up the New SQL Server
Database
Now that
you have the data structures intact, run a full SQL Server database backup.
Don’t forget to also back up your master database.
Step 13: Load Access Data into SQL
Server
Following
the order you established in step 6, load the Access data into SQL Server,
table by table. If you have a database with no AutoNumber fields, the data
should migrate to SQL Server with few problems. You can simply run Append
queries for all your tables. If you have large tables, you might need to run
several smaller queries for each table.
If you
have a database with AutoNumber fields, you can't use Append queries. These
queries don't preserve the data values in the AutoNumber fields because the
fields become SQL Server Identity columns. As a result, if another table uses
the AutoNumber field as a referencing field in an enforced relationship, the
child record often references the wrong parent record or won’t load into SQL
Server at all.
In our
case, every table except for the Constants table had AutoNumber fields. Thus,
we ran a Pass-Through query for each table. We used this type of query to set
the table's IDENTITY_INSERT to ON, append the data, and then set the table's
IDENTITY_INSERT to OFF. (Only one table per database can have its
IDENTITY_INSERT enabled at a time.)
To
partially automate the Pass-Through process, we used the LoadData subroutine in
Listing 2. You can’t totally automate the process because you must load the
tables individually in the correct order. Before you use this subroutine, you
need to substitute the correct Data Source Name (DSN), UserID, password, and
database name for both qdf.Connect strings. You run this subroutine from the
Debug window, passing it the names of the tables one by one.
If a
table is too big to load at one time, uncomment the line that sets the ODBCTimeout
to 0 so the process can continue indefinitely. If you uncomment this line, be
patient—your computer might appear to hang, but it’s probably running fine.
Another
approach for handling large tables is to break the load into manageable
segments by adding a WHERE clause to the qdf2.SQL statement. Put the name of
the AutoNumber field in the WHERE clause, and choose appropriate values so you
load only a few thousand records at a time. If the table has numerous OLE or
Memo fields, you might want to load only a few hundred at a time. Keep changing
the value in the WHERE clause and rerunning the procedure until all the data is
in the table.
Every few
tables, make sure all the data is loading. Sometimes we had data that wouldn't
load because of omitted optional foreign keys. Other times we couldn’t figure
out why records weren’t loading. We even checked for triggers that didn’t allow
nulls but found nothing. Oddly enough, the solution in those cases was to
delete the DRI foreign key in the child table and recreate the table.
Step 14: Write Triggers
For
relationships with cascading events, you must delete the DRI foreign keys and
write triggers. You also need to write triggers for recursive relationships and
table-level validation rules. Writing triggers is typically a time-consuming
process—but don’t rush it and be thorough because triggers ensure data
integrity. Listing 1 contains example SQL Server triggers.
Step 15: Test the New SQL Server
System
You must
test the SQL Server system numerous times. Add new data to every table and
break all the rules to make sure the system acts predictably.
15 Steps to Success
When
planning an Access-to-SQL Server upsizing project, don’t underestimate the time
you need for data conversion. By following the tips in these 15 steps, I hope
you'll be spared a few painful hours we spent banging our heads against the
Upsizing Wizard. Now that we’ve performed this data upsizing procedure a few
times, we’re down to an average of 11 hours for small data-conversion projects.
0 Comments:
Post a Comment
Subscribe to Post Comments [Atom]
<< Home