Never been to DZone Snippets before?

Snippets is a public source code repository. Easily build up your personal collection of code snippets, categorize them with tags / keywords, and share them with the world

About this user

Fyodor Kupolov http://j2ee.by.ru

« Newer Snippets
Older Snippets »
Showing 1-10 of 12 total  RSS 

Using JavaScript for ETL transformations

Scriptella provides a simple way to perform various transformations in JavaScript (or other scripting language which have a corresponding driver).
Our example transformation consists of 3 steps:
1) Select rows from source table.
2) Transform a column value from number to text
3) Insert a transformed value into a destination table.


<!DOCTYPE etl SYSTEM "http://scriptella.javaforge.com/dtd/etl.dtd">
<etl>
    <connection id="db" driver="auto" url="jdbc:hsqldb:mem:tst" user="sa" password="" classpath="../lib/hsqldb.jar"/>
    <connection id="js" driver="script"/> 
    <connection id="log" driver="text"/> <!-- For printing debug information on the console -->

    <script connection-id="db">
        CREATE TABLE Table_In (
            Error_Code INT
        );
        CREATE TABLE Table_Out (
            Error VARCHAR(10)
        );
        
        INSERT INTO Table_IN VALUES (1);
        INSERT INTO Table_IN VALUES (7);
    </script>

    <query connection-id="db">
        SELECT * FROM Table_In
        <script connection-id="log">
            Transforming $Error_Code
        </script>
        <!-- Transformation is described as an enclosing query
         which is executed before nested elements -->
        <query connection-id="js"> 
            <![CDATA[
               if (Error_Code < 5) {
                    Error_Code='WARNING'; //Set a transformed value
               } else {
                    Error_Code='ERROR'; //Set a transformed value
               }
               query.next(); //Don't forget to trigger nested scripts execution
            ]]>
            <script connection-id="db">
                <!-- Insert transformed value -->
                INSERT INTO Table_Out VALUES (?Error_Code); 
            </script>
            <script connection-id="log">
                Transformed to $Error_Code
            </script>
        </query>
    </query>
</etl>

How to execute Scriptella ETL files

Scriptella ETL provides several ways to execute ETL files:

Invocation from Ant
<taskdef resource="antscriptella.properties" classpath="/path/to/scriptella.jar[;additional_drivers.jar]"/>
<etl file="path/to/etl/file/> <!-- Execute ETL file from specified location -->

Command-Line Execution
Just type scriptella to run the file named etl.xml in the current directory. Alternatively you can use Java launcher:
java -jar scriptella.jar [arguments]

Executing ETL Files from Java
It is extremely easy to run Scriptella ETL files from java code. Just make sure scriptella.jar is on classpath and use any of the following methods to execute an ETL file:
EtlExecutor.newExecutor(new File("etl.xml")).execute(); //Execute etl.xml file
EtlExecutor.newExecutor(getClass().getResource("etl.xml")).execute(); //Execute etl.xml file loaded from classpath
EtlExecutor.newExecutor(
    servletContext.getResource("/WEB-INF/etl.xml")).execute(); //Execute etl.xml file from web application WEB-INF dir

Integration with Spring Framework
<beans>
    <!-- Spring beans declarations -->

    <!-- Spring managed bean which executes etl.xml file -->
    <bean id="executor" class="scriptella.driver.spring.EtlExecutorBean">
        <property name="configLocation" value="etl.xml"/>
    </bean>
</beans>

The usage of executor is straightforward:
EtlExecutor exec = (EtlExecutor) beanFactory.getBean("executor");
exec.execute();

See Spring Driver JavaDoc for additional details.

How to produce daily database table dumps in CSV format

The following code demonstrates how to produce CSV files with dynamic
file name pattern based on a current day. The produced files have the following naming format:
TABLE_NAME_MM_DD_YYYY.csv

<!DOCTYPE etl SYSTEM "http://scriptella.javaforge.com/dtd/etl.dtd">
<etl>
    <properties> <!-- Configure table name -->
        table_name=test
    </properties>
    <connection id="in" driver="auto" url="jdbc:oracle:thin:@localhost:1521:ORCL" 
      classpath="ojdbc14.jar" user="scott" password="tiger"/>
    <connection id="out" driver="csv" url="${table_name}_${etl.date.now('MM_dd_yyyy')}.csv" />
    <query connection-id="in"> <!-- Query table rows -->
        SELECT * FROM ${table_name}
        <script connection-id="out"> <!-- Export each row into a CSV -->
            $ID, $Name, $Surname <!-- Use column names from selected table -->
        </script>
    </query>
</etl>

Use Scriptella ETL to run the example.

See How to execute an ETL file from command line, Ant or directly from Java .

Splitting large Scriptella ETL files

The following example demonstrates how to split a large Scriptella ETL file into several parts. This example is based on a traditional XML parsed entities approach:

<!DOCTYPE etl SYSTEM "http://scriptella.javaforge.com/dtd/etl.dtd"
[
    <!-- Declaring the first external parsed entity to include -->
    <!ENTITY part1 SYSTEM "part1.xml">
    
    <!-- Declaring the second external parsed entity to include -->
    <!ENTITY part2 SYSTEM "part2.xml">
]>
<etl>
    <connection driver="text"/>

    <!-- Including file #1 -->
    &part1;

    <script>
        content of the script
    </script>
    
    <!-- Including file #2 -->
    &part2;

</etl>

List of Countries and ISO codes Supported by the JVM.

The following code displays country names and ISO-codes for the locales supported by the JVM.

        for (Locale locale : Locale.getAvailableLocales()) {
            final String contry = locale.getDisplayCountry();
            if (contry.length() > 0) {
                System.out.println("Country = " + contry + ". ISO code: " + locale.getISO3Country());
            }
        }

Package Name as a Constant

getClass().getPackage().getName() is not a reliable solution, because getPackage() can be null depending on a class loader.
static final String PACKAGE_NAME = MyClass.class.getName().substring(0, 
            MyClass.class.getName().lastIndexOf('.'));

Importing XML into a database with Scriptella ETL

The following Scriptella ETL simple usage example imports RSS file into a database table.
<!DOCTYPE etl SYSTEM "http://scriptella.javaforge.com/dtd/etl.dtd">
<etl>
    <connection id="in" driver="xpath" url="http://snippets.dzone.com/rss"/>
    <connection id="db" driver="hsqldb" url="jdbc:hsqldb:db/rss" user="sa" classpath="hsqldb.jar"/>
classpath="hsqldb.jar"/>
    <query connection-id="in">
        /rss/channel/item
        <script connection-id="db">
            INSERT INTO Rss (ID, Title, Description, Link) 
            VALUES (?rownum, ?title, ?description, ?link);
        </script>
    </query>
</etl>


Here is the full version of the example described above. It creates an RSS table, downloads rss file, inserts rss records into a database, converts rss.xml to a plain text file and saves it to rss.txt.
<!DOCTYPE etl SYSTEM "http://scriptella.javaforge.com/dtd/etl.dtd">
<etl>
    <connection id="in" driver="xpath" url="http://snippets.dzone.com/rss"/>
    <connection id="out" driver="text" url="rss.txt"/>
    <connection id="db" driver="hsqldb" url="jdbc:hsqldb:db/rss" user="sa" classpath="hsqldb.jar"/>
    <script connection-id="db">
       CREATE TABLE Rss (
           ID Integer,
           Title VARCHAR(255),
           Description VARCHAR(255),   
           Link VARCHAR(255)

       )
    </script>
    <query connection-id="in">
        /rss/channel/item
        <script connection-id="out">
            Title: $title
            Description: [
            ${description.substring(0, 20)}...
            ]
            Link: $link
            ----------------------------------
        </script>
        <script connection-id="db">
            INSERT INTO Rss (ID, Title, Description, Link) 
            VALUES (?rownum, ?title, ?description, ?link);
        </script>
    </query>
</etl>

Script to insert BLOB from file into a database

Scriptella ETL allows inserting files into a database. This is achieved by a simple bind variables extension syntax ?{file ...}.
The following sample initializes table of music tracks. Each track has a DATA field containing a file loaded from an external location. File song1.mp3 is stored in the same directory as etl.xml and song2.mp3 is loaded from the web:
    <!DOCTYPE etl SYSTEM "http://scriptella.javaforge.com/dtd/etl.dtd">
    <etl>
        <connection driver="hsqldb" url="jdbc:hsqldb:file:tracks" user="sa" classpath="hsqldb.jar"/>
        <script>
            CREATE TABLE Track (
              ID INT,
              ALBUM_ID INT,
              NAME VARCHAR(100),
              DATA LONGVARBINARY
            );
            <!-- Inserts file with path relative to ETL script location -->
            INSERT INTO Track(id, album_id, name, data) VALUES
                   (1, 1, 'Song1.mp3', ?{file 'song1.mp3'});
            <!-- Inserts file from an external URL-->
            INSERT INTO Track(id, album_id, name, data) VALUES
                   (2, 2, 'Song2.mp3', ?{file 'http://musicstoresample.com/song2.mp3'});
        </script>
    </etl>

Copy table from one database to another

This Scriptella ETL script copies all rows from Src_Table to Dest_Table.
Src_Table contains the following columns: id, first_name, last_name
Dest_Table contains the following columns: id, name
The name column of the Dest_Table is produced by a concatenation of first_name and last_name from the Src_Table
This example demonstrates HSQLDB-To-Oracle copy procedure, although it works between virtually any databases.
<!DOCTYPE etl SYSTEM "http://scriptella.javaforge.com/dtd/etl.dtd">
<etl>
  <connection id="in" driver="hsqldb" url="jdbc:hsqldb:file:demo" 
              classpath="hsqldb.jar" user="sa"/>
  <connection id="out" driver="oracle" url="jdbc:oracle:thin:@localhost:1521:ORCL" 
              classpath="ojdbc14.jar" user="scott" password="tiger"/>
  <!-- Copy all table rows from one to another database -->
  <query connection-id="in">
      SELECT * FROM Src_Table --Selects all rows
      <!-- For each row executes insert -->  
      <script connection-id="out"> 
          INSERT INTO Dest_Table(ID, Name) 
          VALUES (?id,?{first_name+' '+last_name})
      </script>
  </query>
</etl>

Load CSV data into a database (Scriptella ETL tool)

This example demonstrates usage of Scriptella ETL Tool to load CSV data into a database table.

Input CSV file data.csv:
id,priority,summary,status
1,Critical,NullPointerException in Main class,Open
5,Low,"Checkstyle, PMD, Findbugs issues",Reopened
7,Low,Maven integration,Open
10,High,SPI API,Closed

The CSV loading script has the following content:
<!DOCTYPE etl SYSTEM "http://scriptella.javaforge.com/dtd/etl.dtd">
<etl>
  <connection id="in" driver="csv" url="data.csv"/>
  <connection id="out" driver="oracle" url="jdbc:oracle:thin:@localhost:1521:ORCL" 
      classpath="ojdbc14.jar" user="scott" password="tiger"/>
  <!-- Copy all CSV rows to a database table -->
  <query connection-id="in">
      <!-- Empty query means select all columns -->
      <script connection-id="out">
          INSERT INTO Table_Name VALUES (?id,?priority, ?summary, ?status)
      </script>
  </query>
</etl>


Use RegEx to filter CSV data:
<query connection-id="in">
    <!--Select bugs with status open or reopened.-->
    ,,,open|reopened
    <!--Inserts imported rows into a database-->
    <script connection-id="out">
       INSERT INTO Table_Name VALUES (?id, ?priority, ?summary, ?status);
    </script>
</query>

« Newer Snippets
Older Snippets »
Showing 1-10 of 12 total  RSS