The URL File Dialog serves to navigate through the file system and select input or output files.
Many components ask you to specify the URL of a file. Such a file can be a source from which data is read, a target to which data is written, a file used to transform the data flowing through a component, or some other file. To specify the URL of such a file, you can use the URL File Dialog.
The URL File Dialog has several tabs.
Figure 25.1. URL File Dialog
The Local files tab serves to locate files on a local file system. The combo box contains local file system places and parameters. The tab can be used to locate files both in CloverETL projects and anywhere else on the local file system.
Figure 25.2. URL File Dialog - Local files
Best practice is to specify file paths in the Workspace view rather than in the Local files tab. Combined with parameters, the Workspace view makes your graphs more portable.
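To illustrate why parameterized paths travel better between environments, here is a minimal Python sketch of placeholder substitution. The parameter name DATAIN_DIR and both directory values are hypothetical, and the sketch relies on Python's string.Template, not on CloverETL's own parameter resolver.

```python
from string import Template

def resolve_url(url_template: str, parameters: dict) -> str:
    """Replace ${NAME} placeholders with values from a parameter map."""
    return Template(url_template).substitute(parameters)

# The same file URL resolves differently depending on the parameter values.
url = "${DATAIN_DIR}/orders.csv"                  # hypothetical parameter
dev = resolve_url(url, {"DATAIN_DIR": "data-in"})
prod = resolve_url(url, {"DATAIN_DIR": "/srv/etl/data-in"})
```

Because only the parameter value changes, the URL stored in the graph can stay the same on every machine.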
The Workspace view tab serves to locate files in the workspace of a local CloverETL project.
Figure 25.3. URL File Dialog - Workspace view
The Clover Server tab serves to locate files in all open CloverETL Server projects. It is available only for CloverETL Server projects.
Figure 25.4. URL File Dialog - Clover Server
The Hadoop HDFS tab serves to locate files on a Hadoop Distributed File System.
Figure 25.5. URL File Dialog - Hadoop HDFS
You need a working Hadoop Connection to choose the particular files.
The Remote files tab serves to locate files on a remote computer or on the Internet. You can specify properties of connection, proxy settings, and HTTP properties.
You can type the URL directly in the format described in Supported File URL Formats for Readers or Supported File URL Formats for Writers, or you can specify it with the help of the Edit URL Dialog. The Edit URL Dialog is accessible under the globe icon.
Figure 25.6. Edit URL Dialog
The Edit URL Dialog lets you specify a connection to a remote server conveniently. Choose the protocol and specify the host name, port, credentials, and path.
The dialog lets you specify the connection using the following protocols:
SFTP - FTP over SSH
WebDav over SSL
Windows Share - SMB1/CIFS
Windows Share - SMB 2.x, SMB 3.x
Click to save the connection settings. Click to use it.
Figure 25.7. Edit URL Dialog
The Load button serves to load a session from the list for subsequent editing.
The Delete button serves to delete the session from the list.
If the protocol is HTTP, HTTPS, FTP, SFTP - FTP over SSH, WebDav, WebDav over SSL, Windows Share - SMB1/CIFS or Windows Share - SMB 2.x or 3.x, the dialog allows you to specify the host name, port, username, password, and path on the server. It allows you to connect anonymously, as well.
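As a rough illustration of how these fields fit together, the following Python sketch composes a URL of the general form protocol://username:password@host:port/path. The function name and the example host are hypothetical, and percent-encoding of credentials is an assumption; the authoritative syntax is the one described in Supported File URL Formats for Readers and Writers.

```python
from urllib.parse import quote

def build_remote_url(protocol, host, path, port=None, username=None, password=None):
    """Compose protocol://user:password@host:port/path; credentials are optional."""
    authority = host
    if port is not None:
        authority = f"{authority}:{port}"
    if username:
        # Assumption: reserved characters such as '@' or ':' are percent-encoded.
        credentials = quote(username, safe="")
        if password:
            credentials += ":" + quote(password, safe="")
        authority = f"{credentials}@{authority}"
    # Omitting the username yields an anonymous URL.
    return f"{protocol}://{authority}/{path.lstrip('/')}"
```

For example, calling build_remote_url("ftp", "example.com", "pub/file.txt") without a username produces a URL with no credentials part, which corresponds to an anonymous connection.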
If you are reading from or writing into remote files and are connected via an SFTP protocol using a certificate-based authorization, you should:
Create a directory named ssh-keys in your project;
Put the private key files into this directory and choose a suitable filename.
Listed in order from the highest to lowest priority when resolving, the private key file can have the following names:
*.key (the files are resolved in alphabetical order).
If you want to explicitly select a certificate for a specific location, the best way is to use the name with the highest priority, i.e.
Figure below shows the format of the OpenSSH private key generated by
Figure 25.8. Example of Generated OpenSSH Private Key
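The lowest-priority rule above, where *.key files are tried in alphabetical order, can be sketched in Python as follows. This is only an illustration of the ordering, not CloverETL's resolver: the function name is hypothetical, the higher-priority file names are not covered, and a case-sensitive sort by file name is an assumption.

```python
from pathlib import Path

def candidate_keys(project_dir):
    """Return *.key files from <project>/ssh-keys, sorted alphabetically by name."""
    key_dir = Path(project_dir) / "ssh-keys"
    if not key_dir.is_dir():
        return []
    return sorted(
        (p for p in key_dir.glob("*.key") if p.is_file()),
        key=lambda p: p.name,  # assumption: plain, case-sensitive ordering
    )
```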
CloverETL is able to connect to FTP proxy using the following URL syntax:
In the case of the Amazon S3 protocol, the dialog allows you to fill in the access key, secret key, bucket, and path. For better performance, you should also fill in the corresponding region.
Figure 25.9. Edit URL Dialog - Amazon S3
Having the connection specified, you can choose the particular file(s).
Amazon S3 URL
It is recommended to connect to S3 via endpoint-specific S3 URL:
The endpoint in the URL should be the endpoint corresponding to the bucket.
A URL with a specific endpoint performs much better than the generic one, but it lets you access only the buckets of that region.
The endpoint affects the signature version that will be used. If you connect to the generic endpoint, the signature version may not match the endpoint being used. In that case, the request is sent twice and messages like the following appear in the log:
DEBUG [main] - Received error response: com.amazonaws.services.s3.model.AmazonS3Exception: The authorization mechanism you have provided is not supported. Please use AWS4-HMAC-SHA256. (Service: null; Status Code: 400; Error Code: InvalidRequest; Request ID: 2D7C4933BD5ED2F8), S3 Extended Request ID: 9wmejqgrZ0jRpgqvw43RXUBZOzm9rnd5/wVN19kSe0dHAF/k5rxq34jvRhy8bHd5JnqBcQTBwkM=
WARN [main] - Attempting to re-send the request to cloveretl.example.test.s3.eu-central-1.amazonaws.com with AWS V4 authentication. To avoid this warning in the future, please use region-specific endpoint to access buckets located in regions that require V4 signing.
For a list of regions and endpoints, see AWS Regions and Endpoints (Amazon S3).
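The difference between the generic and the region-specific endpoint can be sketched in Python as follows. The function names are hypothetical, and the host pattern s3.<region>.amazonaws.com is an assumption based on the host shown in the warning above; some regions may use other forms, so treat the AWS list of endpoints as authoritative.

```python
def s3_endpoint(region=None):
    """Return the generic S3 host, or a region-specific host when a region is given."""
    if region is None:
        return "s3.amazonaws.com"
    # Assumed pattern; verify it against the AWS endpoint list for your region.
    return f"s3.{region}.amazonaws.com"

def bucket_host(bucket, region):
    """Bucket-qualified host, in the form seen in the logged warning."""
    return f"{bucket}.{s3_endpoint(region)}"
```

With the bucket and region from the log, bucket_host("cloveretl.example.test", "eu-central-1") reproduces the host to which the request was re-sent.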
When the S3 URL does not contain Secret Key + Access Key, CloverETL automatically searches for credentials in the following sources (in this order):
Recommended since they are recognized by all the AWS SDKs and CLI except for .NET
only recognized by Java SDK
Java System Properties
Credential profiles file at the default location (shared by all AWS SDKs and the AWS CLI)
Credentials delivered through the Amazon EC2 container service (the AWS_CONTAINER_CREDENTIALS_RELATIVE_URI environment variable must be set and the security manager must have permission to access the variable)
Instance profile credentials delivered through the Amazon EC2 metadata service
For detailed information, see the official AWS documentation.
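The "first source that yields credentials wins" lookup can be sketched in Python as below. This is not CloverETL's or the AWS SDK's implementation. It covers only two of the listed sources, and the environment variable names, the profile file path, and the INI key names are assumptions taken from the AWS SDK documentation rather than from this text.

```python
import configparser
import os

def from_environment(env):
    """Look for credentials in environment variables (assumed AWS SDK names)."""
    key, secret = env.get("AWS_ACCESS_KEY_ID"), env.get("AWS_SECRET_ACCESS_KEY")
    return (key, secret) if key and secret else None

def from_profile_file(path, profile="default"):
    """Read an INI-style credential profiles file, if it exists."""
    parser = configparser.ConfigParser()
    if not parser.read(path) or not parser.has_section(profile):
        return None
    section = parser[profile]
    key = section.get("aws_access_key_id")
    secret = section.get("aws_secret_access_key")
    return (key, secret) if key and secret else None

def find_credentials(env=os.environ,
                     profile_path=os.path.expanduser("~/.aws/credentials")):
    """Try each source in order and return the first credentials found."""
    sources = (
        lambda: from_environment(env),
        lambda: from_profile_file(profile_path),
    )
    for source in sources:
        credentials = source()
        if credentials is not None:
            return credentials
    return None
```

Keeping each source as a separate function makes the order explicit: reordering the tuple changes the priority without touching the lookup logic.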
These sources of credentials may be used for graph development in a local project;
for example, set
Figure 25.10. URL File Dialog - Remote files
The Port tab serves to specify fields and processing type for port reading or writing. It opens only in components that allow such a data source or target.
Figure 25.11. URL File Dialog - Input Port
Figure 25.12. URL File Dialog - Output Port
The Dictionary tab serves to specify the dictionary key value and processing type for dictionary reading or writing. It opens only in components that allow such a data source or target.
Figure 25.13. URL File Dialog - Dictionary
See also: Using a Dictionary in Graphs
If you use a URL File Dialog configured to display only files with certain extensions, you can see the File Extension field below File URL.
Figure 25.14. Configured URL File Dialog
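The effect of such an extension filter can be approximated with the short Python sketch below. The glob-style patterns, the function name, and the case-insensitive comparison are all assumptions made for illustration; the dialog's actual matching rules are not described here.

```python
from fnmatch import fnmatchcase

def filter_by_extension(filenames, patterns):
    """Keep only names that match at least one pattern such as '*.csv'."""
    return [
        name for name in filenames
        # Lower-casing both sides makes the comparison case-insensitive.
        if any(fnmatchcase(name.lower(), pattern.lower()) for pattern in patterns)
    ]
```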
To ensure graph portability, forward slashes are used for defining the path in URLs (even on Microsoft Windows).
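If you have a Windows-style path at hand, converting it to the forward-slash form is straightforward. The following Python sketch shows one way to do it; the function name and sample paths are hypothetical.

```python
from pathlib import PureWindowsPath

def to_portable_path(path: str) -> str:
    """Rewrite a Windows-style path so that it uses forward slashes."""
    return PureWindowsPath(path).as_posix()
```

PureWindowsPath parses backslash-separated paths on any operating system, so the conversion behaves the same on Windows, Linux, and macOS.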
The New Directory action is available on the toolbar of the Workspace view and Local files tabs. The F7 key can be used as a shortcut for the action. The newly created directory is selected in the dialog and its name can be edited in-line. Press F2 to rename a directory and DEL to delete it.
More detailed information of URLs for each of the tabs described above is provided in sections