database - DB bulk validation and upload


I am designing an application that involves bulk uploads of records into a Postgres DB (let's call the schema db-1). Uploads are done every week, and the size ranges from a few million to a billion records. The data being uploaded needs to be validated/cleansed first, since it has to conform to the constraints and format of db-1. I am thinking of adopting the following approach:

  1. Every time a new upload needs to be done, a new schema is created (let's call it db-2, a staging area) that mirrors db-1 but with lenient constraints. This makes sure the data gets loaded into db-2 to start with.
  2. Run a validation process on the data. I was initially thinking of a middleware process, but when I realized the amount of data to be processed, I started leaning towards coding the validation+cleansing layer in the DB itself: a set of stored procs that run on db-2, check the data, and generate a report of records that do not conform to the rules (i.e. the constraints present in db-1, the data format, etc.).
  3. After this, the data needs to be corrected again at the source, step 1 is repeated, and if everything looks OK, a SELECT from db-2 into db-1 shifts the valid data to its final destination.
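To make the steps concrete, here is a rough sketch of the kind of SQL I have in mind. All table and column names (customers, email, load_errors, etc.) are made up purely for illustration, and the validation rules shown are just placeholders for whatever constraints db-1 actually enforces:

```sql
-- Step 1: staging schema mirroring db-1, but with lenient constraints:
-- everything is plain text and nullable, no PKs/FKs/CHECKs, so the raw
-- bulk load never fails on bad rows.
CREATE SCHEMA IF NOT EXISTS db2;

CREATE TABLE db2.customers (
    id        text,    -- db-1 has: bigint PRIMARY KEY
    email     text,    -- db-1 has: NOT NULL plus a format CHECK
    joined_on text     -- db-1 has: date NOT NULL
);

-- Bulk load into staging; COPY is the fastest path for flat files:
-- COPY db2.customers FROM '/path/to/weekly_file.csv' WITH (FORMAT csv);

-- Step 2: validation pass. Instead of letting constraints reject rows,
-- collect every offending row plus a reason into a report table.
CREATE TABLE db2.load_errors AS
SELECT c.*,
       CASE
         WHEN c.id IS NULL OR c.id !~ '^[0-9]+$'  THEN 'bad id'
         WHEN c.email IS NULL OR c.email !~ '@'   THEN 'bad email'
         ELSE 'bad date'
       END AS failure_reason
FROM db2.customers AS c
WHERE c.id IS NULL OR c.id !~ '^[0-9]+$'
   OR c.email IS NULL OR c.email !~ '@'
   OR c.joined_on IS NULL OR c.joined_on !~ '^\d{4}-\d{2}-\d{2}$';

-- Step 3: once the report is empty (or the bad rows have been fixed at
-- the source and re-loaded), promote the clean rows into db-1, casting
-- to the strict types as we go.
INSERT INTO db1.customers (id, email, joined_on)
SELECT c.id::bigint, c.email, c.joined_on::date
FROM db2.customers AS c
WHERE NOT EXISTS (
    SELECT 1 FROM db2.load_errors AS e WHERE e.id = c.id
);
```

The report-table query and the promotion query would each live inside a stored proc in practice, so they can be run per upload and per table.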

What is your opinion of the above process? Are there any obvious or hidden issues you see here? Suggestions to make it better are welcome.

thanks

j

