Batch Data Sync Example
1. Overview
This example will take you through the creation and execution of a batch data sync where data is loaded into Cinchy via a CSV file. In this example, we will be loading information into the People table in Cinchy. This is a self-contained example and can be replicated in any Cinchy environment without dependencies.
Example Use Case: You have historically maintained a record of all of your employees in a spreadsheet. Knowing that this significantly hinders your data management capabilities, you want to sync your file into Cinchy. Once synced, you can manage your employee information through the Cinchy data browser instead of through a data silo.
You can review our documentation on Delimited File Sources here.
2. Sample Files and Code
This section contains:
The People table XML schema. Alternatively, the table can be created manually through the Cinchy Data Browser.
A sample source CSV data file to load into Cinchy.
2.1 Cinchy Table XML
To create the People table used in this example, you can use the XML below. Alternatively, you can create the table manually using the steps in section 2.1.1.
2.1.1 Manual Table Creation
1. Log in to your Cinchy platform.
2. Under My Network, click the Create button.
3. Select Table.
4. Select From Scratch.
5. Create a table with the following properties (Image 1):
Table Details | Values |
---|---|
Table Name | People |
Icon + Colour | Default |
Domain | Sandbox (if this domain does not exist, please either create it or make sure to update this parameter where required during the data sync) |
6. Click Columns in the left-hand navigation to create the columns for the table.
7. Click the "Click Here to Add" button and add the following columns.
Column Details | Values |
---|---|
Column 1 | Column Name: Name, Data Type: Text |
Column 2 | Column Name: Title, Data Type: Text |
Column 3 | Column Name: Company, Data Type: Text |
8. Click the Save button to save your table.
2.2 Data File
The sample CSV file used in this example can be downloaded below.
Please note: if you are downloading this file to recreate this exercise, the file path and file name will need to be:
C:\Data\contacts.csv
Alternatively, you can update the path parameter in the data sync configuration to match the file path and name of your choosing.
The source file contains the following information which will be synced into our target Cinchy table (Image 2).
As we can see, the file has the following columns:
First Name
Last Name
Email Address
Title
Company
The People table we created has only the following columns:
Name
Title
Company
When syncing the data from our source (CSV file) to our target (Cinchy People table) our batch data sync will need to take into consideration the following:
The first and last name from the source will need to be merged into one column in our target (Name)
The email address from our source is not a column in the target, therefore this column will not be synced into the target
The Title column will be a one-to-one match from source to target
The Company column will also be a one-to-one match from source to target
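For illustration, the source file is laid out like the snippet below. These rows are placeholders showing the structure (header row, comma delimiter); they are not the contents of the actual sample file.

```csv
First Name,Last Name,Email Address,Title,Company
Jane,Doe,jane.doe@example.com,Data Analyst,Example Corp
John,Smith,john.smith@example.com,Account Manager,Example Inc
```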
3. Creating the Data Sync
There are two options when you want to create a data sync in Cinchy.
You can input all of your necessary information through the intuitive Connections UI. Once saved, all of this data is uploaded as an XML into the Data Sync configurations table.
Or, you can bypass the UI and upload your XML config directly into the Data Sync configuration table yourself.
This example will walk you through option one.
3.1 Using the Connections UI
Within your Cinchy platform, navigate to the Connections Experience (Image 3).
In the Info tab, input the name of your data sync. For this example we are using "Contact Import" (Image 4).
Since this is a local file upload, we also need to set a Parameter. This value will be referenced in the "path" value of the Load Metadata box in step 5. For this example, we will set it to filepath (Image 5).
Navigate to the Source tab. As a reminder, we are using the .CSV file you downloaded at the beginning of this example as our source.
Under "Select a Source", choose Delimited File (Image 6).
The "Load Metadata" box will appear; this is where you will define some important values about your source needed for the data sync to execute. Using the below table as your guide, fill in your metadata parameters (Image 7):
Parameter | Description | Example |
---|---|---|
Source | The source location of your file. This can be either Local, S3, or Azure Blob Storage. | Local |
Delimiter | The type of delimiter on your source file. | Since our file is a CSV, the delimiter is a comma, so we use the ',' value. |
Text Qualifier | A text qualifier is a character used to distinguish the point at which the contents of a text field should begin and end. | "" |
Header Rows to Ignore | The number of records from the top of the file to ignore before the data starts (includes column header). | 1 |
Path | The path to your source file. In this case, it is the Parameter that was set in step 3. | @filepath |
Choose File | This option will appear once you've correctly set your Path value. | Upload the sample CSV for this example. |
Click Load.
In the Available Columns pop-up, select all of the columns that you want to import from the CSV. For this example, we will select them all (noting, however, that we will only map a few of them later) (Image 8).
Click Load.
Once you load your source, the schema section of the page will auto-populate with the columns that you selected in step 7 (Image 9). Review the schema to ensure it has the correct Name and Data Type. You may also choose to set any Aliases or add a Description.
Navigate to the Destination tab and select Cinchy Table from the drop down.
In the Load Metadata pop-up, input the Domain and Table name for your destination. In this example, we are using the Sandbox domain and the People table (Image 10).
Click Load.
Select the columns that you wish to use in your data sync (Image 11). These will be the columns that your source syncs to. In this example, we are using the Name, Title, and Company columns. Note that you will also have many Cinchy system table columns available to use. Click Load.
The Connections experience will attempt to automatically map your source and destination columns based on matching names. In the below screenshot, it has been able to correctly match the "Company" and "Title" columns (Image 12). The "Name" target column is not a 1:1 match for any of our source columns, so we must match that one manually.
Select "First Name" from the Source Column drop down to finish mapping our data sync (Image 13).
Navigate to the Sync Behaviour tab. There are two options for data syncs: Full File and Delta. In this example, select Full File.
Full File (full load) processing means that the entire data set is imported the first time a data source is loaded. Delta processing, on the other hand, loads the data incrementally, pulling in the source data at specific pre-established intervals.
Read more about Sync Behaviour types here.
Set the following parameters (Image 14):
Parameter | Description | Example |
---|---|---|
Sync Key Column Reference | The SyncKey is used as a unique key reference when syncing the data from the data source into the Cinchy table. It is used to match data between the source and the target. This allows for updates to occur on changed records. | Name |
New Record Behaviour | This defines what action is taken when a new record is found in the sync source. This can be either Insert or Ignore. | Insert |
Dropped Record Behaviour | This defines what action is taken when a dropped record is found in the sync source. This can be either Delete, Ignore, or Expire. | Delete |
Changed Record Behaviour | This defines what action is taken when a changed record is found in the sync source. This can be either Update, Ignore, or Conditional. | Update |
Navigate to the Permissions tab. Here you will define your group access controls for your data sync (Image 15). You can set this how you like. For this example, we are giving all users access to Execute, Write, and Read our sync.
Any groups given Admin Access will have the ability to Execute, Write, and Read the data sync.
Navigate to the Jobs tab. Here you will see a record of all successful or failed jobs for this data sync.
Select "Start a Job" (Image 16).
Load your sample .CSV file in the pop-up window (Image 17).
The job will commence. The Execution window that pops up will help you to verify that your data sync is progressing (Image 18).
Navigate to your destination table to ensure that your data populated correctly (Image 19).
3.2 Using a Data Sync XML
In lieu of using the Connections UI, you can also set up a data sync by uploading a correctly formatted XML into the Data Sync Configs table within Cinchy.
We recommend only doing so once you have a good grasp of how data syncs work. Note that not all sources/targets follow the same XML pattern.
Below is the completed batch data sync configuration for this example. Review the XMLs and then proceed to section 3.2.3 for further instructions.
3.2.1 Blank XML Example
The below XML shows what a blank data sync could look like for a Delimited File source to a Cinchy Table target.
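As a minimal sketch of a blank configuration, the element and attribute names below are assumed to mirror the parameters described in section 3.2.2; confirm the exact schema against an XML generated by your Connections UI before using it.

```xml
<?xml version="1.0" encoding="utf-16"?>
<!-- Hypothetical skeleton: element and attribute names mirror the parameters
     in section 3.2.2 and may differ from your Cinchy version's exact schema. -->
<BatchDataSyncConfig name="" version="1.0.0" xmlns="http://www.cinchy.co">
  <Parameters>
    <Parameter />
  </Parameters>
  <DelimitedDataSource source="PATH" path="" delimiter="" textQualifier="&quot;" headerRowsToIgnore="1">
    <Schema>
      <Column name="" dataType="Text" />
    </Schema>
  </DelimitedDataSource>
  <CinchyTableTarget domain="" table="">
    <ColumnMappings>
      <ColumnMapping sourceColumn="" targetColumn="" />
    </ColumnMappings>
    <SyncKey>
      <SyncKeyColumnReference name="" />
    </SyncKey>
    <NewRecordBehaviour type="INSERT" />
    <DroppedRecordBehaviour type="DELETE" />
    <ChangedRecordBehaviour type="UPDATE" />
  </CinchyTableTarget>
</BatchDataSyncConfig>
```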
3.2.2 Filled XML Example
The below filled XML example matches the Connections UI configuration we made in step 3.1. You can review the parameters used in the table below.
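A filled-in sketch follows, using the same assumed schema as the blank example in section 3.2.1 and the example values from the parameter table below; treat it as illustrative rather than as a copy of the attached configuration.

```xml
<?xml version="1.0" encoding="utf-16"?>
<!-- Hypothetical filled example: values come from the parameter table below.
     The parameter is named "Parameter" to match that table; in the UI
     walkthrough (step 3.1) it was named "filepath". -->
<BatchDataSyncConfig name="Contact Import" version="1.0.0" xmlns="http://www.cinchy.co">
  <Parameters>
    <Parameter>Parameter</Parameter>
  </Parameters>
  <DelimitedDataSource source="PATH" path="@Parameter" delimiter="," textQualifier="&quot;" headerRowsToIgnore="1">
    <Schema>
      <Column name="First Name" dataType="Text" />
      <Column name="Last Name" dataType="Text" />
      <Column name="Email Address" dataType="Text" />
      <Column name="Title" dataType="Text" />
      <Column name="Company" dataType="Text" />
    </Schema>
  </DelimitedDataSource>
  <CinchyTableTarget domain="Sandbox" table="People">
    <ColumnMappings>
      <ColumnMapping sourceColumn="Company" targetColumn="Company" />
      <ColumnMapping sourceColumn="Title" targetColumn="Title" />
      <ColumnMapping sourceColumn="First Name" targetColumn="Name" />
    </ColumnMappings>
    <SyncKey>
      <SyncKeyColumnReference name="Name" />
    </SyncKey>
    <NewRecordBehaviour type="INSERT" />
    <DroppedRecordBehaviour type="DELETE" />
    <ChangedRecordBehaviour type="UPDATE" />
  </CinchyTableTarget>
</BatchDataSyncConfig>
```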
Parameter | Description | Example |
---|---|---|
Name | The name of your data sync. | Contact Import |
Parameter | Since this is a local file upload, we also need to set a Parameter. This value will be referenced in the "path" value of the Load Metadata box | Parameter |
Source | Defines whether your source is Local (PATH), S3, or Azure. | PATH |
Path | Since this is a local upload, this is the path to your source file. In this case, it is the value that was set for the "Parameter" value, preceded by the '@' sign. | @Parameter |
Delimiter | The type of delimiter on your source file. | Since our file is a CSV, the delimiter is a comma, so we use the ',' value. |
Text Qualifier | A text qualifier is a character used to distinguish the point at which the contents of a text field should begin and end. | &quot; |
Header Rows to Ignore | The number of records from the top of the file to ignore before the data starts (includes column header). | 1 |
Column Name | The name(s) of the source columns that you wish to sync. Note that in this example we have selected more columns than we will map, in order to show how Connections ignores unmapped data. | "First Name" "Last Name" "Email Address" "Title" "Company" |
Column Data Type | The data type that corresponds to our selected source columns. | "Text" |
Domain | The domain of your Cinchy Target table. | Sandbox |
Table | The name of your Cinchy Target table. | People |
Column Mapping Source Column | The name(s) of the source columns that you are syncing. | "Company" "Title" "First Name" |
Column Mapping Target Column | The name(s) of the target column as it maps to the specified source column. | "Company" "Title" "Name" |
Sync Key Column Reference Name | The SyncKey is used as a unique key reference when syncing the data from the data source into the Cinchy table. It is used to match data between the source and the target. This allows for updates to occur on changed records. | "Name" |
New Record Behaviour Type | This defines what will happen when new records are found in the source. | INSERT |
Dropped Record Behaviour Type | This defines what will happen when dropped records are found in the source. | DELETE |
Changed Record Behaviour Type | This defines what will happen when changed records are found in the source. | UPDATE |
3.2.3 Using the Data Sync XML
Once you have completed your Data Sync XML, navigate to the Data Sync Configurations table in Cinchy (Image 20).
In a new row, paste the Data Sync XML into the Config XML column (Image 21).
Define your group permissions in the applicable columns. In our example, we have given All Users Admin Access.
The Name and Config Version columns will be auto-populated, as their values come from the Config XML.
Be sure to double-click into the Config XML column before pasting; otherwise, each line of the XML will appear as an individual record in the Data Sync Configurations table.
To execute your Data Sync you will use the CLI. If you do not have this downloaded, refer to the documentation here.
In this example we will be using the following data sync command parameters; for the full list of commands, click here.
Parameter | Description | Example |
---|---|---|
-s (server) | Required. The full path to the Cinchy server without the protocol (e.g. cinchy.co/Cinchy). | "pilot.cinchy.co/Training/Cinchy/" |
-u (userid) | Required. The user id to login to Cinchy that has execution access to the data sync. | "admin" |
-p (password) | Required. The password of the above User ID parameter. This can optionally be encrypted. For a walkthrough on how to use the CLI to encrypt the password, refer to the Appendix section. | "DESuEGqfffsamx55yl256hjuPYxa4ncc+5+bLkoVIFpgs0Lq6hkcU=" |
-f (feed) | Required. The name of the Data Sync Configuration as defined in Cinchy | "Contact Import" |
Launch PowerShell and navigate to the Cinchy CLI directory.
Enter and execute the following command in PowerShell:
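A sketch of the execution command is shown below, assuming the CLI's syncdata command and reusing the example values from the parameter table above; substitute your own server, credentials, and temp directory.

```powershell
# Run the batch data sync defined by the "Contact Import" configuration.
# The syncdata command and the -d (temp directory) flag are assumed here based
# on the parameters described in this guide; adjust them for your CLI version.
.\Cinchy.CLI.exe syncdata `
    -s "pilot.cinchy.co/Training/Cinchy/" `
    -u "admin" `
    -p "DESuEGqfffsamx55yl256hjuPYxa4ncc+5+bLkoVIFpgs0Lq6hkcU=" `
    -f "Contact Import" `
    -d "C:\Cinchy\temp"
```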
Once executed, navigate to your destination table to validate that your data synced correctly (Image 22).
4. Appendix
4.1 Password Encryption
To encrypt a password using Powershell, complete the following:
Launch PowerShell and navigate to the Cinchy CLI directory (note: you can always type powershell into the Windows Explorer path bar from the Cinchy CLI directory).
Enter the following into PowerShell:
.\Cinchy.CLI.exe encrypt -t "password"
Press Enter to execute the command.
Copy the encrypted password somewhere accessible (e.g. Notepad, Visual Studio Code, Word) so that you have it available at batch execution time.
Please note, you will need to replace "password" with your specific password.
4.2 Execution Logs
The Execution Log table is a system table in Cinchy that logs the outputs of all data syncs (Image 23). You can always review the entries in this table for information on the progression of your syncs.
4.3 Execution Errors
The Execution Errors table is a system table in Cinchy that logs any errors that may occur in a data sync (Image 24). Any data sync errors will also be logged in the temp directory outlined in the data sync execution command (e.g. -d "C:\Cinchy\temp").