Having used PostgreSQL (PG) on a daily basis for years, I have found that there are use cases where you need to load (e.g. for further analytics) a batch of not fully consistent records with occasional type or column-count mismatches. Since PG raises an error on the first bad record, currently the only solution is to preprocess your data with another tool and then load it into PG. Frequently it is easier to drop certain records than to do such preprocessing for every data source you have.
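To make the preprocessing step concrete, here is a minimal Python sketch of an external filter that drops records with column-count or type mismatches before they are loaded. The three-column schema, the `split_rows` helper and the sample data are all hypothetical and chosen only for illustration; a real data source would need its own schema and parsing rules.

```python
import csv
import io

# Hypothetical target schema: (column name, converter) pairs.
SCHEMA = [("id", int), ("name", str), ("score", float)]


def split_rows(text, schema=SCHEMA):
    """Split CSV text into rows matching the schema and rejected rows."""
    good, rejected = [], []
    for lineno, row in enumerate(csv.reader(io.StringIO(text)), start=1):
        # Reject rows with the wrong number of columns.
        if len(row) != len(schema):
            rejected.append((lineno, "column count mismatch"))
            continue
        try:
            # Convert every field; a failed conversion is a type mismatch.
            good.append(tuple(conv(v) for (_, conv), v in zip(schema, row)))
        except ValueError:
            rejected.append((lineno, "type mismatch"))
    return good, rejected


sample = "1,alice,3.5\n2,bob\nx,carol,1.0\n4,dave,2.25\n"
good_rows, rejected_rows = split_rows(sample)
```

Only `good_rows` would then be passed on to the database; `rejected_rows` keeps the line number and reason for each dropped record. Writing and maintaining such a filter for every data source is exactly the overhead described above.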

The naive solution is to wrap the insert of each record in a subtransaction; however, this would ruin COPY performance and burn through transaction IDs (XIDs) very rapidly.
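The per-record subtransaction pattern can be sketched as follows. To keep the example self-contained and runnable, it uses Python's built-in `sqlite3` module as a stand-in for PG; the table `t`, its `CHECK` constraint (used here to force a type error) and the `load_with_savepoints` helper are assumptions made for illustration, not part of the project. The structure is the point: one savepoint per record, which is what makes this approach costly.

```python
import sqlite3


def load_with_savepoints(conn, rows):
    """Insert rows one by one, each inside its own savepoint."""
    loaded, skipped = 0, 0
    conn.execute("BEGIN")
    for row in rows:
        # One savepoint (subtransaction) per record.
        conn.execute("SAVEPOINT rec")
        try:
            conn.execute("INSERT INTO t (id, name) VALUES (?, ?)", row)
        except sqlite3.Error:
            # Undo only this record; earlier records stay in place.
            conn.execute("ROLLBACK TO SAVEPOINT rec")
            skipped += 1
        else:
            loaded += 1
        conn.execute("RELEASE SAVEPOINT rec")
    conn.execute("COMMIT")
    return loaded, skipped


# isolation_level=None lets the code manage transactions explicitly.
conn = sqlite3.connect(":memory:", isolation_level=None)
conn.execute(
    "CREATE TABLE t (id INTEGER CHECK (typeof(id) = 'integer'), name TEXT)"
)

rows = [(1, "a"), ("oops", "b"), (3, "c", "extra"), (4, "d")]
loaded, skipped = load_with_savepoints(conn, rows)
stored = conn.execute("SELECT id, name FROM t ORDER BY id").fetchall()
```

Here the second row fails the type check and the third has too many columns, so both are skipped while the rest are committed. The cost is one savepoint round trip per record.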

Parallel execution may help both to boost COPY performance on multi-core systems and to catch errors by running separate worker processes.

Organization

Student

Alexey Kondratov

Mentors

  • Anastasia Lubennikova
  • Alexander Korotkov
  • alvherre

2017