Getting Started with Pipes
Before Signing Up...
Pipes comes with a 14-day free trial. The trial starts the moment you complete the signup process by setting a password, so to make the most of it, please check the following prerequisites:
- You have a target storage (data warehouse) and the credentials for it at hand;
- You have the credentials for your data source(s) at hand;
- You can test Pipes in the next 14 calendar days.
Server Location
Before filling in the signup form, please check the offered server location in the left panel (it will be either Germany or the USA). If you want to switch to another location, you can click the provided link.
There are two things to keep in mind in connection with the server location:
- The URL for your Pipes installation will be https://us.pipes.datavirtuality.com/ if you have selected the USA as your server location, or https://eu.pipes.datavirtuality.com/ if you have selected Germany;
- The Terms and Conditions and Privacy Policy take local laws into account, so the legal documents regulating the use of Pipes exist in two versions. You can read about this in more detail and check out the documents for your selected server location here.
Signing Up
To sign up, you need to enter your name, company name, and company email. You will then receive an email with a link to set your password, which will finalize the signup process and mark the start of your 14-day trial period.
Once you have signed up for the trial, you will be redirected to your Pipes page and prompted to set up your own Pipes installation. This is done in three quick steps: connecting the data warehouse, setting up the data sources, and configuring data pipelines. The setup wizard will guide you through all three steps. You do not need to set up everything the first time: you can change your connected data warehouse, add more sources, or set up more pipelines later via the relevant pages.
Pipes runs on our servers, so you do not have to install or adjust anything in your own system. However, there is one prerequisite: if you are using a firewall, please remember to allow the relevant IP address so that Pipes can connect. Here are the IP addresses used by Pipes:
- US: 3.211.212.103 (hostname us.pipes.datavirtuality.com)
- EU: 35.157.22.113 (hostname eu.pipes.datavirtuality.com)
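If you maintain firewall rules with a script, the mapping in the list above can be kept in one place. The following is a minimal Python sketch: the names PIPES_ENDPOINTS and allowlist_ip are illustrative and not part of Pipes, and the values are copied from the list above.

```python
from urllib.parse import urlparse

# Hostname -> IP address used by Pipes, copied from the list above.
PIPES_ENDPOINTS = {
    "us.pipes.datavirtuality.com": "3.211.212.103",
    "eu.pipes.datavirtuality.com": "35.157.22.113",
}


def allowlist_ip(pipes_url: str) -> str:
    """Return the IP address to allow for a given Pipes URL (hypothetical helper)."""
    host = urlparse(pipes_url).hostname
    if host not in PIPES_ENDPOINTS:
        raise ValueError(f"Unknown Pipes hostname: {host!r}")
    return PIPES_ENDPOINTS[host]


if __name__ == "__main__":
    # Look up the address for an EU installation.
    print(allowlist_ip("https://eu.pipes.datavirtuality.com/"))
```

How the address is then added to the firewall depends on the firewall product, so that step is intentionally left out.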
Connecting Data Warehouse
You can connect your data warehouse by going to the Connect your Data Warehouse page (also available under Preferences -> Data Warehouse) and clicking on the button with your data warehouse logo. The exact connection steps depend on the data warehouse itself: for some, you will have to enter settings manually, and for others, you might need to authenticate on the data warehouse's authorization page. The settings are accompanied by a detailed help section with information specific to the selected data warehouse.
Most data warehouses have two groups of settings: basic settings and advanced settings. An exception is Google BigQuery, which has an authorization button instead of basic settings.
Basic Settings
Basic settings are, as a rule, mandatory fields. Where possible, they have been pre-configured for your convenience, but you can change the values as you like; all fields are editable.
Advanced Settings
Some data warehouses have advanced settings for special, optional, or advanced properties. These are pre-configured, but you can change the indicated values if needed.
Connecting Data Sources
All data sources supported by Pipes are listed on the Sources tab. You can find the data source you need by browsing the list or by using the search bar, which is usually quicker.
All data sources require authorization. The exact process depends on the data source: for some, a special token is required, while others need you to authenticate via OAuth. We provide a detailed help section for each data source to make it easier.
Configuring Data Pipelines
1. Source
In the first step, you will be asked to select the source for the new pipeline from the list of available data sources (these are the data sources you have added previously). Select the source you need and click Next.
2. Data Element
Now select the data to replicate from the templates listed in the table below. For some templates, you only need to select the template and click Next; for others, you will be asked to specify your table name. Some templates also support two types of replication: full (the default) and incremental (applied if you disable the default option). With full replication, all data is replicated on every run. With incremental replication, Pipes compares the already replicated tables with the source tables and replicates only the data that has appeared since the last run.
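The difference between the two replication types can be illustrated with a small Python sketch. This is not how Pipes is implemented: the function names are hypothetical, and the sketch assumes that every source row carries an increasing updated_at value that can be used to detect new data.

```python
from typing import Dict, List

Row = Dict[str, int]  # simplified row, e.g. {"id": 1, "updated_at": 10}


def full_replication(source: List[Row]) -> List[Row]:
    """Full replication: rebuild the target from the whole source on every run."""
    return [dict(row) for row in source]


def incremental_replication(source: List[Row], target: List[Row]) -> List[Row]:
    """Incremental replication: add only rows newer than anything already in the target.

    Assumption: "new" is decided by comparing the updated_at column.
    """
    last_seen = max((row["updated_at"] for row in target), default=-1)
    new_rows = [dict(row) for row in source if row["updated_at"] > last_seen]
    return target + new_rows


if __name__ == "__main__":
    source = [{"id": 1, "updated_at": 10}, {"id": 2, "updated_at": 20}]
    target = full_replication(source)              # first run copies everything
    source.append({"id": 3, "updated_at": 30})     # a new row appears in the source
    target = incremental_replication(source, target)
    print(len(target))
```

In this sketch, a full run always processes every source row, while an incremental run only touches rows that appeared after the previous run, which is why incremental replication tends to be cheaper for large tables.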
3. Pipe Properties
Next, check the preview of the data element (it is limited to 20 records for reasons of speed), give a name to the resulting table in your storage (the name may contain only letters, digits, and underscores), and select the schedule for data replication.
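If you generate table names programmatically, you can check them against the naming rule before entering them. The following Python sketch uses a hypothetical helper, is_valid_table_name, and assumes that "letters" means ASCII letters, which is not specified here.

```python
import re

# Letters, digits, and underscores only, as stated above.
# Assumption: "letters" is interpreted as ASCII letters (A-Z, a-z).
TABLE_NAME_PATTERN = re.compile(r"[A-Za-z0-9_]+")


def is_valid_table_name(name: str) -> bool:
    """Return True if the whole name consists of allowed characters only."""
    return TABLE_NAME_PATTERN.fullmatch(name) is not None


if __name__ == "__main__":
    for candidate in ("orders_2024", "orders-2024", "my table"):
        print(candidate, is_valid_table_name(candidate))
```

Your data warehouse may impose additional rules of its own (for example, on name length or the first character), which this sketch does not cover.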
At present, there are six schedule options here: once every hour, once every 4 hours, every morning at 6:00 AM, every night at 12:00 AM, every Monday at 6:00 AM, and every Sunday at 6:00 AM (if you need to define a different schedule, you can do so later - see below!). By default, replication will start immediately, but you can disable this setting.
When you have selected the schedule, click Create now. You are all set!
Advanced Schedule Settings
If you need to set up a custom schedule for a specific pipeline (for example, every Friday at 11:00 PM), go to the Pipes page, find the pipeline you need, click the button with three dots on the right, and select Schedules. You can then view the existing schedule(s) for this pipeline, delete them if needed, and set the desired custom schedule: daily, weekly (on the day of your choice), monthly (on the day(s) of your choice), or every 30 minutes/1 hour/2 hours/4 hours. Defining several schedules for one pipeline is also possible.
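To reason about when a weekly schedule such as "every Friday at 11:00 PM" will fire next, a short Python sketch can help. The function next_weekly_run is hypothetical and not part of Pipes. It works on naive datetimes, so the time zone in which Pipes evaluates schedules is an open assumption here.

```python
from datetime import datetime, timedelta


def next_weekly_run(now: datetime, weekday: int, hour: int, minute: int = 0) -> datetime:
    """Return the next occurrence of the given weekday and time strictly after `now`.

    weekday follows Python's convention: Monday is 0 and Sunday is 6.
    """
    # Start from today's date at the scheduled time of day.
    candidate = now.replace(hour=hour, minute=minute, second=0, microsecond=0)
    # Move forward to the requested weekday (0 to 6 days ahead).
    candidate += timedelta(days=(weekday - now.weekday()) % 7)
    # If that moment is not in the future, the next run is one week later.
    if candidate <= now:
        candidate += timedelta(days=7)
    return candidate


if __name__ == "__main__":
    FRIDAY = 4
    print(next_weekly_run(datetime.now(), FRIDAY, 23))
```

When several schedules are defined for one pipeline, the same calculation can be applied to each of them, and the earliest result is the next run.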
Manual Execution
Regardless of the schedule defined for a pipeline, you can execute replication at any time if needed: just go to the Pipes page, find the pipeline you need, and click the > button on the left.