I'm working on a system that mirrors remote datasets using initials and deltas. When an initial comes in, it mass-deletes the preexisting data and mass-inserts the fresh data. When a delta comes in, the system does a bunch of work to translate it into updates, inserts, and deletes. Initials and deltas are processed inside long transactions to maintain data integrity.
Unfortunately, the current solution isn't scaling well. The transactions are so large and long-running that our RDBMS bogs down with various contention problems. Also, there isn't an audit trail of how the deltas were applied, making it difficult to troubleshoot issues that cause the local and remote versions of the dataset to fall out of sync.
One idea is to not run the initials and deltas in transactions at all, and instead attach a version number to each record indicating which delta or initial it came from. Once an initial or delta has fully loaded, the application can be alerted that a new version of the dataset is available.
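To make that concrete, here is a minimal sketch of how the versioned-record scheme might look; the table layout, column names, and use of SQLite are all my own assumptions, not a definitive design. Each initial or delta gets a row in a versions table, its records are appended tagged with that version, and a completion flag is flipped at the end so readers never see a half-loaded version:

```python
import sqlite3

# Hypothetical schema: a versions table plus an append-only records table.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE dataset_versions (
    version_id   INTEGER PRIMARY KEY,
    kind         TEXT NOT NULL CHECK (kind IN ('initial', 'delta')),
    loaded_at    TEXT DEFAULT CURRENT_TIMESTAMP,
    is_complete  INTEGER DEFAULT 0          -- flipped once the load finishes
);
CREATE TABLE records (
    key          TEXT NOT NULL,
    value        TEXT,                      -- NULL marks a delete tombstone
    version_id   INTEGER NOT NULL REFERENCES dataset_versions(version_id),
    PRIMARY KEY (key, version_id)
);
""")

def load_delta(conn, rows):
    """Append one delta as a new version; no long transaction over old data.
    `rows` is an iterable of (key, value) pairs, value=None meaning delete."""
    cur = conn.execute("INSERT INTO dataset_versions (kind) VALUES ('delta')")
    version_id = cur.lastrowid
    conn.executemany(
        "INSERT INTO records (key, value, version_id) VALUES (?, ?, ?)",
        [(k, v, version_id) for k, v in rows])
    # Readers only consider versions marked complete, which doubles as the
    # "new version available" signal and as an audit trail of each delta.
    conn.execute(
        "UPDATE dataset_versions SET is_complete = 1 WHERE version_id = ?",
        (version_id,))
    conn.commit()
    return version_id
```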
This leaves the issue of how to compose a view of the dataset at a given version from the initial and the deltas. (Apple's Time Machine does something similar, using hard links on the file system to create a "view" of a point in time.)
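Continuing the sketch above (same hypothetical schema), one way to compose the dataset as of version N is a single query: for each key, take the row from the highest version that is at most N but no older than the most recent initial (since an initial replaces everything), then drop keys whose latest row is a delete tombstone:

```python
def view_at(conn, target_version):
    """Compose the dataset as it stood at `target_version`."""
    return conn.execute("""
        SELECT r.key, r.value
        FROM records r
        JOIN (SELECT key, MAX(version_id) AS v
              FROM records
              WHERE version_id <= :target
                AND version_id >= (SELECT COALESCE(MAX(version_id), 0)
                                   FROM dataset_versions
                                   WHERE kind = 'initial'
                                     AND version_id <= :target)
              GROUP BY key) latest
          ON r.key = latest.key AND r.version_id = latest.v
        WHERE r.value IS NOT NULL      -- hide deleted keys
        ORDER BY r.key
    """, {"target": target_version}).fetchall()
```

Older versions could then be compacted away periodically once no reader needs them, much like Time Machine pruning old snapshots.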
Does anyone have experience solving this kind of problem or implementing a particular solution?
Thanks!
Have one writer and several reader databases. You send writes to the one writer database and have it propagate the exact same changes to the other databases. The reader databases stay consistent, and the time to update is fast. I have seen this done in environments with upwards of 1M page views per day. It is very scalable. You can even put a hardware router in front of the read databases to load balance them.
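For illustration, here is roughly how the routing could look on the application side; the connection targets and the random-choice balancing are placeholders for whatever driver and load balancer you actually use, with the database's own replication keeping the readers in sync:

```python
import random
import sqlite3  # stand-in; any DB-API driver works the same way

# Hypothetical connection targets: one writer, N replicated readers.
WRITER_DSN = "writer.db"
READER_DSNS = ["reader1.db", "reader2.db", "reader3.db"]

def get_connection(for_write: bool):
    """Route writes to the single writer; spread reads across replicas
    (a software stand-in for the hardware load balancer mentioned above)."""
    dsn = WRITER_DSN if for_write else random.choice(READER_DSNS)
    return sqlite3.connect(dsn)
```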