Using sstableloader
The sstableloader tool can be used to bulk-load data into a Cassandra cluster. First, generate sstables by reading an external data source (in the example below, a .csv file); sstableloader then streams the sstable data files to the Cassandra cluster, conforming to the cluster's replication strategy. The following example walks through loading data from .csv files into Cassandra.
1. Download and install Cassandra, and use Cassandra GUI to monitor your cluster.
2. Consider a CSV file with the fields uuid, firstname, lastname, password, age, email:
5bd8c586-ae44-11e0-97b8-0026b0ea8cd0, Alice, Smith, asmi1975, 32, alice.smith@mail.com
4bd8cb58-ae44-12e0-a2b8-0026b0ed9cd1, Bob, Miller, af3!df8, 28, bob.miller@mail.com
1ce7cb58-ae44-12e0-a2b8-0026b0ad21ab, Carol, White, cw1845?, 49, c.white@mail.com
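Each CSV row has to be parsed into typed values before it can be written to an sstable. Below is a minimal Java sketch of that parsing step, assuming a plain comma split is sufficient (no quoted fields or embedded commas). The CsvEntry class and its parse method are hypothetical helpers introduced for illustration; they are not part of Cassandra.

```java
import java.util.UUID;

// Hypothetical helper for illustration only; not a Cassandra class.
final class CsvEntry {
    final UUID key;
    final String firstname;
    final String lastname;
    final String password;
    final long age;
    final String email;

    private CsvEntry(UUID key, String firstname, String lastname,
                     String password, long age, String email) {
        this.key = key;
        this.firstname = firstname;
        this.lastname = lastname;
        this.password = password;
        this.age = age;
        this.email = email;
    }

    // Parses one CSV line; returns null when the line is malformed.
    // Assumption: fields never contain commas, so a plain split is enough.
    static CsvEntry parse(String line) {
        String[] columns = line.split(",");
        if (columns.length != 6) {
            return null;
        }
        try {
            return new CsvEntry(
                    UUID.fromString(columns[0].trim()),
                    columns[1].trim(),
                    columns[2].trim(),
                    columns[3].trim(),
                    Long.parseLong(columns[4].trim()),
                    columns[5].trim());
        } catch (IllegalArgumentException e) {
            // Covers both an invalid UUID and a non-numeric age
            // (NumberFormatException extends IllegalArgumentException).
            return null;
        }
    }

    public static void main(String[] args) {
        CsvEntry entry = parse(
                "5bd8c586-ae44-11e0-97b8-0026b0ea8cd0, Alice, Smith, asmi1975, 32, alice.smith@mail.com");
        System.out.println(entry.firstname + " " + entry.lastname + ", age " + entry.age);
    }
}
```

Returning null for a malformed line lets the caller skip bad rows and keep going, which is usually preferable in a bulk import to aborting the whole run.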
Let's create one column family, 'Users', from this CSV. First create the keyspace 'Demo' and the column family 'Users':
create keyspace Demo;
use Demo;
create column family Users
with key_validation_class=LexicalUUIDType
and comparator=AsciiType
and column_metadata=[
{ column_name: 'firstname', validation_class: AsciiType }
{ column_name: 'lastname', validation_class: AsciiType }
{ column_name: 'password', validation_class: AsciiType }
{ column_name: 'age', validation_class: LongType, index_type: KEYS }
{ column_name: 'email', validation_class: AsciiType }];
3. Create sstables using the SSTableSimpleUnsortedWriter class
Configuration: To compile this file, the Cassandra jar (>= 0.8.2) needs to be on the classpath (javac -cp <path_to>/apache-cassandra-1.1.1.jar DataImportExample.java). To run it, the Cassandra jar needs to be present, as well as the jars of the libraries used by Cassandra (those in the lib/ directory of the Cassandra source tree). A valid cassandra.yaml and log4j configuration files should also be accessible; typically, this means the conf/ directory of the Cassandra source tree should be on the classpath.
- If you are using the Eclipse IDE, add the Cassandra conf folder to the classpath: classpath-->Advanced settings-->Add External Folder, then add the apache-cassandra-x.x.x/conf folder to the path.
- In IntelliJ IDEA, go to Run-->Edit Configurations-->Application-->Configuration-->VM Options. There, give the path to cassandra.yaml as follows:
-Dcassandra-foreground -Dcassandra.config=file:///<path to/apache-cassandra-1.1.2/conf/cassandra.yaml> -ea -Xmx1G
After you run this, you will see that a folder called Demo has been created; in it you will find the generated sstables as .db and .sha1 files.
For example, in the above case it will create the following sstable files (with keyspace = "Demo" and column family = "Users"):
Demo-Users-hd-1-Data.db
Demo-Users-hd-1-Digest.sha1
Demo-Users-hd-1-Filter.db
Demo-Users-hd-1-Index.db
Demo-Users-hd-1-Statistics.db
4. Loading sstables into Cassandra using sstableloader
To load sstables from the command line
- Go to CASSANDRA_HOME/bin and run the command ./sstableloader Demo, or
- Go to CASSANDRA_HOME and run the command bin/sstableloader /some/path/to/the<KeyspaceName>
- If you are running everything on localhost, try the following steps:
- Set up another loopback address with the command (Linux): sudo ifconfig lo:0 127.0.0.2 netmask 255.0.0.0 up
- Make a copy of the running Cassandra folder and apply the following configuration changes to the copy:
Set rpc_address and listen_address in conf/cassandra.yaml to 127.0.0.2. If you want to listen on all interfaces, you can set rpc_address to 0.0.0.0.
- Run sstableloader from the copied Cassandra using the command
./sstableloader -d 127.0.0.2 <path to sstable../keyspace_name/columnfamily_name>
For the example above, the command would be ./sstableloader -d 127.0.0.2 pathtosstables/Demo/Users
To run sstableloader from a Java module
If you want to run sstableloader from your own Java module, you can call org.apache.cassandra.tools.BulkLoader.main(args), where args is the array of values you would supply when running sstableloader from the command line.
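A minimal sketch of such a call is shown here, reusing the host and path from the localhost example. The BulkLoadArgs class and its build method are hypothetical names introduced for illustration. The loader is looked up by reflection only so that the sketch compiles without the Cassandra jars; with them on the classpath, calling BulkLoader.main(args) directly does the same thing.

```java
// Hypothetical wrapper for illustration only; not a Cassandra class.
final class BulkLoadArgs {

    // Builds the same argument vector as: sstableloader -d <host> <sstable directory>
    static String[] build(String initialHost, String sstableDir) {
        return new String[] { "-d", initialHost, sstableDir };
    }

    public static void main(String[] ignored) throws Exception {
        String[] args = build("127.0.0.2", "pathtosstables/Demo/Users");
        try {
            // Equivalent to org.apache.cassandra.tools.BulkLoader.main(args)
            // when the Cassandra jars and conf/ directory are on the classpath.
            Class.forName("org.apache.cassandra.tools.BulkLoader")
                 .getMethod("main", String[].class)
                 .invoke(null, (Object) args);
        } catch (ClassNotFoundException e) {
            System.err.println("Cassandra jars are not on the classpath; nothing was loaded.");
        }
    }
}
```

Note that BulkLoader.main may end the JVM via System.exit in some Cassandra versions, so it is safest to invoke it as the last step of the program or in a separate process.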
Load sstables using a JMX client
In my personal experience, using a JMX client is much faster than using sstableloader.