I have a quite large CSV file (a few hundred MBs) that I'm trying to import into a Postgres table. The problem arises when there is a primary key violation (a duplicate record in the CSV file).
If it were just one file, I could manually filter out the duplicate records, but these files are generated by a program that produces such data every hour, and a script has to automatically import them into the database.
My question is: is there a way, such as a flag I can set on the COPY command or somewhere in Postgres, to skip the duplicate records and continue importing the rest of the file into the table?
My thought was to approach it in one of 2 ways:
- Use a utility that can produce an "exception report" of duplicate rows, such as those hit during the COPY process.
- Change the workflow to load the data into a temp table first, massage out the duplicates (maybe join against the target table and mark existing rows in the temp table with a "dup" flag), then import only the missing records and send the dups to an exception table (a rough sketch of this is below).
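To make the second approach concrete, here is a minimal sketch in SQL/psql. The table name `events`, its key column `event_id`, the file name, and the exception table are all hypothetical stand-ins for the real schema; `\copy` is the psql meta-command that reads the file from the client side.

```sql
-- 1. Load the raw CSV into a staging table with the same layout but no constraints.
CREATE TEMP TABLE events_staging (LIKE events INCLUDING DEFAULTS);

\copy events_staging FROM 'hourly_data.csv' WITH (FORMAT csv, HEADER true)

-- 2. Park rows whose key already exists in the target table in an exception table.
CREATE TABLE IF NOT EXISTS events_exceptions (LIKE events);

INSERT INTO events_exceptions
SELECT s.*
FROM events_staging s
WHERE EXISTS (SELECT 1 FROM events e WHERE e.event_id = s.event_id);

-- 3. Insert only the missing records; DISTINCT ON also collapses duplicates
--    that occur within the CSV file itself (which row wins is unspecified
--    unless an ORDER BY is added).
INSERT INTO events
SELECT DISTINCT ON (event_id) *
FROM events_staging s
WHERE NOT EXISTS (SELECT 1 FROM events e WHERE e.event_id = s.event_id);
```

The staging table deliberately has no primary key, so the COPY itself can never fail on duplicates; all the conflict handling happens in plain SQL afterwards.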
I would prefer the second approach, but that's a matter of the specific workflow in my case.