I recently came across an issue while exporting a huge partitioned table for a data migration. The export took ages for no obvious reason.
After some research I found that parallelism wasn't working: I could only see one Data Pump worker active at a time. I then found a note explaining the behaviour (Doc ID 1467662.1).
A table with BASICFILE LOBs cannot be exported in parallel; Oracle advises converting those BASICFILEs to SECUREFILEs, which was not possible in my case.
The second solution was to run multiple export jobs in parallel, each one exporting a fraction of the table rows. In the note's example, Oracle uses a query on a range of the primary key for each export. That was fine, but I needed a more automatic way to divide the rows based on the degree of parallelism I wanted to apply, and I found it in this blog post from Morten Jensen.
He uses the modulo of the ROWID block number to split the rows into equal buckets based on a parallelism parameter, which was exactly what I was looking for. He achieved it with a Perl script; I preferred to write shell scripts because I have only a little knowledge of Perl. I created one script to run the export (in my case for a specific partition, but the script supports whole tables as well) and another to import the dumpfiles generated in parallel.
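The bucketing idea can be sketched on its own: each of the N concurrent jobs gets a WHERE clause that keeps only the data blocks whose block number is congruent to the job index modulo N, so every row lands in exactly one export. A minimal illustration (not part of the scripts below) that just prints the generated predicates:

```shell
# With PARALLEL=4, mod() returns 0..3, so the four predicates below
# partition the table's blocks into four disjoint, roughly equal sets.
PARALLEL=4
for i in $(seq 0 $((PARALLEL - 1))); do
  echo "where mod(dbms_rowid.rowid_block_number(rowid), ${PARALLEL}) = $i"
done
```

Each predicate is then passed to a separate expdp job through the QUERY parameter.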
Data Pump parallel scripts
#!/bin/bash
#############################################################################
# EXPORT IN PARALLEL WITH DATAPUMP
#############################################################################
#
# Description: This script creates multiple export jobs in parallel
#              to export a table
#
#############################################################################
#
# Author: Cyrille MODIANO
#
# Version : 1.0 (2016-10-25)
# Version : 1.1 (2016-11-30) - Fixed mod = 0 values skipped during export
#
#############################################################################

BASE_SCHEMA=$1
BASE_TABLE=$2
PARALLEL=$3
PARTITION=$4

function usage(){
	echo "USAGE:
	Parameter 1 is the SCHEMA
	Parameter 2 is the TABLE NAME
	Parameter 3 is the DEGREE of parallelism
	Parameter 4 (optional) is the partition (if any)"
}

if [ $# -lt 3 ]; then
	usage
	exit 1
fi

if [ $# -eq 4 ]; then
	PARFILE=${BASE_SCHEMA}_${BASE_TABLE}_${PARTITION}.par
	echo "tables=${BASE_SCHEMA}.${BASE_TABLE}:${PARTITION}" > $PARFILE
	START_MESSAGE="Beginning export of partition : ${BASE_SCHEMA}.${BASE_TABLE}:${PARTITION}"
	END_MESSAGE="Finished export of partition: ${BASE_SCHEMA}.${BASE_TABLE}:${PARTITION}"
	DUMPFILE_BASE=${BASE_SCHEMA}_${BASE_TABLE}_${PARTITION}
	LOGFILE_BASE=${BASE_SCHEMA}_${BASE_TABLE}_${PARTITION}
else
	PARFILE=${BASE_SCHEMA}_${BASE_TABLE}.par
	echo "tables=${BASE_SCHEMA}.${BASE_TABLE}" > $PARFILE
	START_MESSAGE="# Beginning export of table : ${BASE_SCHEMA}.${BASE_TABLE}"
	END_MESSAGE="# Finished export of table: ${BASE_SCHEMA}.${BASE_TABLE}"
	DUMPFILE_BASE=${BASE_SCHEMA}_${BASE_TABLE}
	LOGFILE_BASE=${BASE_SCHEMA}_${BASE_TABLE}
fi

# Adding parameters to the parfile
echo "directory=DATA_PUMP" >> $PARFILE
echo "EXCLUDE=STATISTICS" >> $PARFILE
echo "CLUSTER=N" >> $PARFILE

echo "#########################################################################################"
echo $START_MESSAGE
echo "#########################################################################################"
echo " "

LIMIT=$(expr $PARALLEL - 1)
START_TIME=`date`

for i in `seq 0 $LIMIT`
do
	# each job exports only the blocks whose number is congruent to $i mod $PARALLEL
	QUERY="where mod(dbms_rowid.rowid_block_number(rowid), ${PARALLEL}) = $i"
	expdp userid=\'/ as sysdba\' query=$BASE_SCHEMA.$BASE_TABLE:\"$QUERY\" dumpfile=${DUMPFILE_BASE}_${i}.dmp logfile=${LOGFILE_BASE}_${i}.log parfile=$PARFILE &
	sleep 3
done

# wait for all background expdp jobs to finish
wait

echo "#########################################################################################"
echo $END_MESSAGE
echo "# Start time : $START_TIME "
echo "# End time is: `date`"
echo "#########################################################################################"
and the import script:
#!/bin/bash
#############################################################################
# IMPORT IN PARALLEL WITH DATAPUMP
#############################################################################
#
# Description: This script imports multiple dumpfiles in parallel
#
#############################################################################
#
# Author: Cyrille MODIANO
#
# Version : 1.0 (2016-10-25)
# Version : 1.1 (2016-12-12) - added support for working directories and notifications
#
#############################################################################

export ORAENV_ASK=NO
export ORACLE_SID=$1
. oraenv

TABLE_NAME=$2
PARTITION=$3

function usage(){
	echo "USAGE:
	Parameter 1 is the SID of the database where you want to import
	Parameter 2 is the TABLE you want to import
	Parameter 3 (optional) is the PARTITION name you want to import (if any)"
}

if [ $# -lt 2 ]; then
	usage
	exit 1
fi

if [ $# -eq 3 ]; then
	PARFILE=${TABLE_NAME}_${PARTITION}.par
	START_MESSAGE="Beginning import of partition : ${TABLE_NAME}:${PARTITION}"
	END_MESSAGE="Finished import of partition: ${TABLE_NAME}:${PARTITION}"
	SEARCH_PATTERN=${TABLE_NAME}_${PARTITION}
	SUCCESS_MESSAGE="partition: ${TABLE_NAME}:${PARTITION} successfully imported, started at"
	ERROR_MESSAGE="partition: ${TABLE_NAME}:${PARTITION} failed to import, check logfile for more info"
	MAIL_OBJECT="Successfully imported partition ${TABLE_NAME}:${PARTITION}"
else
	PARFILE=${TABLE_NAME}.par
	START_MESSAGE="Beginning import of table : ${TABLE_NAME}"
	END_MESSAGE="Finished import of table : ${TABLE_NAME}"
	SEARCH_PATTERN=${TABLE_NAME}
	SUCCESS_MESSAGE="Table ${TABLE_NAME} successfully imported, started at"
	ERROR_MESSAGE="Table ${TABLE_NAME} failed to import, check logfile for more info"
	MAIL_OBJECT="Successfully imported table ${TABLE_NAME}"
fi

# directories
BASEDIR=/u10/
DUMPDIR=$BASEDIR/DUMP
PARFILEDIR=$BASEDIR/parfiles
mkdir -p $PARFILEDIR

# building the parfile
echo "DIRECTORY=MY_DUMP_DIR" > ${PARFILEDIR}/$PARFILE
echo "CLUSTER=N" >> ${PARFILEDIR}/$PARFILE
echo "TABLE_EXISTS_ACTION=APPEND" >> ${PARFILEDIR}/$PARFILE
echo "DATA_OPTIONS=DISABLE_APPEND_HINT" >> ${PARFILEDIR}/$PARFILE

echo "#########################################################################################"
echo $START_MESSAGE
echo "#########################################################################################"
echo " "

START_TIME=`date`

for dump in `ls ${DUMPDIR}/*${SEARCH_PATTERN}*.dmp`
do
	# dumpfile and logfile names are resolved relative to the directory
	# object, so strip the path from the file name
	DUMPFILE=`basename ${dump}`
	LOGFILE=imp_${DUMPFILE}.log
	impdp userid=\'/ as sysdba\' dumpfile=$DUMPFILE logfile=${LOGFILE} parfile=${PARFILEDIR}/$PARFILE &
	sleep 3
done

# wait for all background impdp jobs to finish
wait

echo "#########################################################################################"
echo $END_MESSAGE
echo "# Start time : $START_TIME "
echo "# End time : `date`"
echo "#########################################################################################"

# Verifying errors
errors_count=`grep ORA- *${SEARCH_PATTERN}*.log | wc -l`

if [ $errors_count -eq 0 ]; then
	echo "$SUCCESS_MESSAGE $START_TIME and finished at `date`" | mail -s "$MAIL_OBJECT" you@your-email.com
else
	# use a distinct subject line on failure
	echo "$ERROR_MESSAGE" | mail -s "Import failed: ${TABLE_NAME}" you@your-email.com
fi
This solution is a workaround for the issue mentioned above, but it can also be used in Standard Edition to get parallelism with Data Pump, a feature otherwise available only in Enterprise Edition.
To use these scripts you will have to change the DIRECTORY parameter to match the directory object you are using.
You can also customize the parfile options; on my side I used the following:
- TRANSFORM=LOB_STORAGE:SECUREFILE (to convert BASICFILE LOBs to SECUREFILEs while importing)
- TABLE_EXISTS_ACTION=APPEND (because I was importing partitions and the table existed already)
- DATA_OPTIONS=DISABLE_APPEND_HINT (mandatory when multiple jobs import into the same table at the same time, as mentioned in the Oracle note)
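Put together, a parfile combining those options might look like the sketch below. The directory object name and the parfile name are placeholders; adapt them to your environment:

```shell
# Write an import parfile with the options discussed above
# (MY_DUMP_DIR and parallel_import.par are assumed names)
PARFILE=parallel_import.par
cat > "$PARFILE" <<'EOF'
DIRECTORY=MY_DUMP_DIR
CLUSTER=N
TABLE_EXISTS_ACTION=APPEND
DATA_OPTIONS=DISABLE_APPEND_HINT
TRANSFORM=LOB_STORAGE:SECUREFILE
EOF
cat "$PARFILE"
```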
Thanks for reading.
Thanks Cyrille for sharing this post.
Foued
You’re welcome Foued.
Hello, very interesting. However, I'm having issues getting it to work. Can you post some script usage please?
Hi,
Usage is at the beginning of the script, what issue are you having?
Thanks
Well done Cyrille & Morten. MOD is magic!
Thanks Vincent
Hi, how do you build the connection string for a container/pluggable database? Assume the pluggable database is called mypdb.
The connect string “/ as sysdba”@mypdb does not work.
The connect string “/@mypdb as sysdba” does not work.
It only works if I put a user explicitly, not “/”.
The connect string “system/password@mypdb as sysdba” does work.
But I don’t want to use sys, system, or any other user. I need to use “/ as sysdba”; how do I make it work with a pluggable database service name?
Thanks in advance
Hi,
As far as I know, you can’t connect to a PDB service without providing a password.
Thanks
Cyrille, great job. Can you take a schema backup in parallel mode? And if yes, what is the command?
I have tried :
Script_name.sh schema_name 5
It’s not working :)… I am on SE.
Hi,
This script was made to export a table or a table partition, but it’s possible to adapt it to export a whole schema; you will need to add a step to loop through the different tables.
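As a rough sketch of that extra step (the wrapper script name and the table list are placeholders; in practice the list would come from a sqlplus query against dba_tables):

```shell
# Hypothetical schema-level wrapper: call the table export script once per
# table. parallel_expdp.sh is an assumed name for the export script above,
# and TABLE_LIST stands in for the output of a dba_tables query.
SCHEMA=MYSCHEMA
DEGREE=4
TABLE_LIST="ORDERS CUSTOMERS ITEMS"
for t in $TABLE_LIST; do
  echo ./parallel_expdp.sh "$SCHEMA" "$t" "$DEGREE"
done
```

Dropping the `echo` would actually launch the exports, one table at a time.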
Thanks
Your code is too sophisticated for me to add a single line. I would be most grateful if you could change the code; I would really appreciate it.
I was able to complete the schema export using PL/SQL.