Sunday, September 21, 2014

Auto syncronize Salesforce data to MySQL or Postgresql with Talend Open Studio

Today I'm writing about using Talend Open Studio to sync Salesforce data to MySQL or Postgresql databases.

First of all, let's download and install Talend Open Studio for Data Integration, make sure you already have Java installed.

Next is unzip and run TOS_DI-macosx-cocoa as I'm using a Mac OS X, or TOS_DI-win-x86_64.exe if you're using Windows.
It's based on Eclipse IDE so you'll see it's pretty easy if you used to work with Eclipse.

Here is the main screen.


Then we will need to create a database connection by expanding the Metadata node -> Db connections -> Right click -> Create connection -> enter a Name (without space) -> click Next button

Then select Db Type (MySQL or Postgresql or whatever in the long list). I'm choosing Postgresql because MySQL limits row size to 65535. So if the table contains too much fields, it will throw out an error: com.mysql.jdbc.exceptions.jdbc4.MySQLSyntaxErrorException: Row size too large. The maximum row size for the used table type, not counting BLOBs, is 65535. You have to change some columns to TEXT or BLOBs

After enter all connection information, click Check button to make sure they are correct. Then Finish.


Next step is creating Salesforce connection. Let's right click the Salesforce node and select Create Salesforce connection. Enter a name then Next.


 Enter the User name and Password.  The Password is combination of Salesforce password and the token. i.e if the Salesforce password is xyz and the token is 1234, then we enter xyz1234 on the Password field. Then Check Login to make sure connection is valid. Then Finish.

Next step is retrieving Salesforce modules by right click on created Salesforce connection and select Retrieve Salesforce modules. Let's check for Account


 Then we will need to create a job to import Salesforce data into the database. Right click on Job Design node, select Create Jobs

Then job design screen is opened and we will expand the Salesforce Account module to drag/drop the Account table into the job design screen as tSalesforceInput component.

Click OK then we have it on the screen

Next step is search for database output component on the Palette box, then drag/drop tPostgresqlOutput component into the screen


 Then we might need to rename it as "account" and click the inside icon to set its properties
Select Repository on Property Type , then select created Postgresql connection - DB (POSTGRESQL):PosgressqlSalesforce 
All connection information automatically filled out. Except Table - enter "account" here, then select Create table if does not exist for Action for table, and Insert for Action for data

 Next step is linking between Salesforce input and Postgresql output by right click on Salesforce input's icon and select Row -> Main, then wire the connection to the Postgresql output's icon
and Run the job.

Then let's check the Postgresql to see the new account table is created with data from Salesforce
Then double the Postgrsql account output to edit its properties, select Insert or Update for Action on data and click Edit schema button, then select the checkbox Key on the Id column of account Output

 Then we might run the job again.
We might repeat those steps to sync other tables.

Now if we want to execute the job automatically by schedule / cron, we will need to build the job.
Right click on the job node, select Build Job


Then unzip the file, explore to the sub folder sync, we will see sync_run.sh or sync_run.bat (or jobName_run.bat/.sh) that will be called in schedule / cron.

So we are done!

FYI, here is the videocast to show you how to sync from database to Salesforce.



4 comments:

  1. This comment has been removed by a blog administrator.

    ReplyDelete
  2. This comment has been removed by a blog administrator.

    ReplyDelete
  3. For automatical no-coding synchronization between Salesforce and SQL databases (MySQL or Postgresql), you can use cloud-based solution - Skyvia (https://skyvia.com/data-integration/synchronization). It has wide capabilities and the wide array of data sources supported. In addition, it has flexible scheduling features, understandable UI and moderate price for the service.

    ReplyDelete
  4. Thanks for posting this, Keep Sharing good content.
    Salesforce Online Training

    ReplyDelete