Author: Igor Petelin, Software Engineer
When migrating data to SugarCRM we faced the issue of filling in region, city, district, street, building classifiers. The information was provided as Microsoft Excel file.
The file’s structure was the following:
For carrying out this task we chose Talend Open Studio for Data Integration, which is a powerful tool for data processing.
To read the Excel file data we can use a standard tFileInputExcel component. The component reads the data of .xls or .xlsx files and retrieves data line by line.
So, let’s begin. First of all, we need to place the component on the design window and go to the parameters by a double click on the component. Here we have to locate a file, file scheme and title (so that the first line does not get into the database).
Since the file has only two columns, a built-in schema can be used. That is what we got:
In advanced settings we tick Trim all columns, in order not to let extra empty meaningless symbols into the database.
We’ve figured it all out with the component for retrieving data; now let’s move on to data input into the database. For this we need to place tMap and tMSSqlOutput components on the design window and connect them.
tMSSqlOutput is a component for inputting data into MS SQL database. Here we need to configure server connection, set the schema and the table name into which the data will be entered. As long as database connection parameters can change, we moved them to Context. Context is the user’s parameters that can be transferred to Job in the process of running the program.
The settings are as follows:
For the components that work with databases, it’s not necessary to create the schema manually, it’s enough to retrieve it with the help of standard Talend methods. For this we press the database connection with the right button and select Retrieve Schema,
click Next and select the required table. It looks like this:
Next, we just need to set the Repository schema in the tMSSqlOutput component parameters and select metadata with the name of the table we require.
The same way we fill in the region, city, district, street, building classifiers. Then we need to extend the software functionality, so we will:
- Teach the utility to read the configuration file parameters;
- Add the check for the existence of files with classifiers;
- Add the elements of actions logging and recording these actions in the file, to see how the process runs.
The simplified scheme of the utility looks like this:
The configuration file looks as follows:
As you can see, we’ve moved some settings to the configuration file, namely:
- path to classifiers files;
- names of classifiers files;
- user and team under which we create classifiers logs/records in the database;
- date of records creation.
We run the utility and see the result of data migration to SugarCRM:
The advantages are the following:
- Talend provides simple and convenient solutions for data migration and transformation in SugarCRM;
- Data migration goes on directly from Talend, also independent utilities can be collected;
- The ready utility can be run both on Windows and Linux platforms;
- The utility flexibility. We can change the location or files names, database connection parameters without introducing changes to the utility;
- We can observe the data migration process and log the actions in the file.
The advantages of the current utility version include the following: we do not check the file as for the structure correctness; also we do not check the record existence in the database. But it can be improved in the coming versions.
To conclude, Talend Open Studio for Data Integration is a convenient product for SugarCRM data migration. The data transfer can start either form the software itself or from the independent utilities, that we can build for Windows and Linux.
We also have had a chance to see how Talend works with Excel files and MS SQL database, and learned to configure components for their work.