Examples¶
The following are some examples of using the Nsync functionality. The following Django models will be used as the target models:
class Person(models.Model):
first_name = models.CharField(
blank=False,
max_length=50,
verbose_name='First Name'
)
last_name = models.CharField(
blank=False,
max_length=50,
verbose_name='Last Name'
)
age = models.IntegerField(blank=True, null=True)
hair_colour = models.CharField(
blank=False,
max_length=50,
default="Unknown")
class House(models.Model):
address = models.CharField(max_length=100)
country = models.CharField(max_length=100, blank=True)
floors = models.IntegerField(blank=True, null=True)
owner = models.ForeignKey(TestPerson, blank=True, null=True)
Example - Basic¶
Using this file:
action_flags | match_on | first_name | last_name | employee_id |
---|---|---|---|---|
cu | employee_id | Andrew | Dodd | EMP1111 |
d* | employee_id | Some | Other-Guy | EMP2222 |
cu | employee_id | Captain | Planet | EMP3333 |
u* | employee_id | Batman | EMP1234 |
And running this command:
> python manage.py syncfile TestSystem myapp Person persons.csv
- Would:
- Create and/or update
myapp.Person
objects with Employee Ids EMP1111 & EMP3333. However, it would not update the two name fields if the objects already existed with non-blank fields. - Delete any
myapp.Person
objects with Employee Id EMP2222. - If a person with Employee Id EMP1234 exists, then it will forcibly update the name fields to ‘C.’ and ‘Batman’ respectively.
- Create and/or update
- NB it would also:
- Create an
nsync.ExternalSystem
object with the name TestSystem, as the default is to create missing external systems. However, because there are noexternal_key
values, noExternalKeyMapping
objects would be created.
- Create an
Example - Basic with External Ids¶
Using this file:
external_key | action_flags | match_on | first_name | last_name | employee_id |
---|---|---|---|---|---|
12212281 | cu | employee_id | Andrew | Dodd | EMP1111 |
43719289 | d* | employee_id | Some | Other-Guy | EMP2222 |
99999999 | cu | employee_id | Captain | Planet | EMP3333 |
11235813 | u* | employee_id | Batman | EMP1234 |
And running this command:
> python manage.py syncfile TestSystem myapp Person persons.csv
- Would:
- Perform all of steps in the ‘Plain file’ example
- Delete any
ExternalKeyMapping
objects that are for the ‘TestSystem’ and have the external key ‘43719289’ (i.e. the record for Some Other-Guy). - Create or update
ExternalKeyMapping
objects for each of the other threemyapp.Person
objects, which contain theexternal_key
value.
Example - Basic with multiple match fields¶
Sometimes you might not have a ‘unique’ field to find your objects with (like ‘Employee Id’). In this instance, you can specify multiple fields for finding your object (separated with a space, ‘ ‘).
For example, using this file:
action_flags | match_on | first_name | last_name | age |
---|---|---|---|---|
cu* | first_name last_name | Michael | Martin | 30 |
cu* | first_name last_name | Martin | Martin | 40 |
cu* | first_name last_name | Michael | Michael | 50 |
cu* | first_name last_name | Martin | Michael | 60 |
And running this command:
> python manage.py syncfile TestSystem myapp Person persons.csv
- Would:
- Create and/or update four persons of various “Michael” and “Martin” name combinations
- Ensure they are updated/created with the correct age!
Example - Two or more systems¶
This is probably the main purpose of this library: the ability to synchronise from multiple systems.
Perhaps we need to synchronise from two data sources on housing information, one is the ‘when built’ information and the other is the ‘renovations’ information.
As-built data:
external_key | action_flags | match_on | address | country | floors |
---|---|---|---|---|---|
111 | cu | address | 221B Baker Street | England | 1 |
222 | cu | address | Wayne Manor | Gotham City | 2 |
Renovated data:
external_key | action_flags | match_on | address | floors |
---|---|---|---|---|
ABC123 | u* | address | 221B Baker Street | 2 |
ABC456 | u* | address | Wayne Manor | 4 |
FOX123 | u* | address | 742 Evergreen Terrace | 2 |
And running this command:
> python manage.py syncfiles AsBuiltDB_myapp_House.csv RenovationsDB_myapp_House.csv
- Would:
Use the mutliple file command,
syncfiles
, to perform multiple updates in one commandCreate the two houses from the ‘AsBuilt’ file
Only update the
country
values of the two houses from the ‘AsBuilt’ file IFF the objects already existed but they did not have a value forcountry
Forcibly set the
floors
attribute for the first two houses in the ‘Renovations’ file.Create 4
ExternalKeyMapping
objects:External System Ext. Key House Object AsBuiltDB 111 212B Baker Street RenovationsDB ABC123 AsBuiltDB 222 Wayne Manor RenovationsDB ABC456 Only update the
floors
attribute for “742 Evergreen Terrace” if the house already exists (and would then also create anExternalKeyMapping
)
Example - Referential fields¶
You can also manage referential fields with Nsync. For example, if you had the following people:
external_key | action_flags | match_on | first_name | last_name | employee_id |
---|---|---|---|---|---|
1111 | cu* | employee_id | Homer | Simpson | EMP1 |
2222 | cu* | employee_id | Bruce | Wayne | EMP2 |
3333 | cu* | employee_id | John | Wayne | EMP3 |
You could set their houses with a file like this:
external_key | action_flags | match_on | address | owner=>first_name |
---|---|---|---|---|
ABC456 | cu* | address | Wayne Manor | Bruce |
FOX123 | cu* | address | 742 Evergreen Terrace | Homer |
The “=>” is used by Nsync to follow the the related field on the provided object.
Example - Referential field gotchas¶
The referential field update will ONLY be performed if the referred-to-fields target a single object. For example, if you had the following list of people:
external_key | action_flags | match_on | first_name | last_name | employee_id |
---|---|---|---|---|---|
1111 | cu* | employee_id | Homer | Simpson | EMP1 |
2222 | cu* | employee_id | Homer | The Greek | EMP2 |
3333 | cu* | employee_id | Bruce | Wayne | EMP3 |
4444 | cu* | employee_id | Bruce | Lee | EMP4 |
5555 | cu* | employee_id | John | Wayne | EMP5 |
6666 | cu* | employee_id | Marge | Simpson | EMP6 |
The owner=>first_name
from the previous example is insufficient to pick out a single person to link a house to (there are 2 Homers and 2 Bruces). Using just the employee_id
field would work, but that piece of information may not be available in the system for houses.
Nsync allows you to specify multiple fields to use in order to ‘filter’ the correct object to create the link with. In this instance, this file would perform correctly:
external_key | action_flags | match_on | address | owner=>first_name | owner=>last_name |
---|---|---|---|---|---|
ABC456 | cu* | address | Wayne Manor | Bruce | Wayne |
FOX123 | cu* | address | 742 Evergreen Terrace | Homer | Simpson |
Example - Complex Fields¶
- If you want a more complex update you can:
- Write an extension to Nsync and submit a Pull Request! OR
- Extend your Django model with a custom setter
If your Person model has a photo ImageField, then you could add a custom handler to update the photo based on a provided file path:
class Person(models.Model):
...
photo = models.ImageField(
blank = True,
null = True,
max_length = 200,
upload_to = 'person_photos',
)
...
@photo_filename.setter
def photo_filename(self, file_path):
...
Do the processing of the file to update the model
And then supply the photos with a file sync file like:
action_flags | match_on | first_name | last_name | employee_id | photo_filename |
---|---|---|---|---|---|
cu* | employee_id | Andrew | Dodd | EMP1111 | /tmp/photos/ugly_headshot.jpg |
Example - Update uses external key mapping over matched object¶
This is an example that is to do with the changes for Issue 1
If Nsync is ‘updating’ objects but their ‘match fields’ change, Nsync will still update the ‘correct’ object.
A common occurrence of this is if the sync data is being produced from a database and an in-row update occurs which changes the match fields but leaves the ‘external key’ (i.e. an SQL ‘UPDATE ... WHERE ...’ statement).
E.g. A person table might look like this:
ID (a DB sequence) | Employee Number | Name |
---|---|---|
10123 | EMP001 | Andrew Dodd |
This could be used to produce an Nsync input CSV like this:
external_key | action_flags | match_on | employee_number | name |
---|---|---|---|---|
10123 | cu* | employee_number | EMP001 | Andrew Dodd |
This would result in an “Andrew Dodd, EMP001” Person object being created and/or updated with an ExternalKeyMapping object holding the ‘10123’ id and a link to Andrew.
If Andrew became a contractor instead of an employee, perhaps the table could be updated to look like this:
ID (a DB sequence) | Employee Number | Name |
---|---|---|
10123 | CONT999 | Andrew Dodd |
This would then produce an Nsync input CSV like this:
external_key | action_flags | match_on | employee_number | name |
---|---|---|---|---|
10123 | cu* | employee_number | CONT999 | Andrew Dodd |
Nsync will use the ExternalKeyMapping object if it is available instead of relying on the ‘match fields’. In this case, the resulting action will cause the Andrew Dodd object to change its ‘employee_number’. This is instead of Nsync using the ‘employee_number’ for finding Andrew.
NB: In this instance, Nsync will also delete any objects that have the ‘new’ match field but are not pointed to by the external key.
Example - Delete tricks¶
This is a list of tricky / gotchas to be aware of when deleting objects.
When syncing from external systems that have external key mappings, it is probably best to use the ‘unforced delete’. This ensures that an object is not removed until all of the external systems think it should be removed.
If using ‘forced delete’, beware that (depending on which sync policy you use) you may end up with different systems fighting over the existence of an object (i.e. one system creating the object, then another deleting it in the same sync).
A system without external key mappings cannot delete objects if it uses an ‘unforced delete’. The reason for this is that the ‘unforced delete’ only removes the model object IF AND ONLY IF it is the last remaining external key mapping. Thus, if a system without external key mappings is the source-of-truth for the removal of an object, you must use the ‘forced delete’ for it to be able to remove the objects.
Alternative Sync Policies¶
The out-of-the-box sync policies are pretty straightforward and are probably worth a read (see the policies.py
file). The system is made so that it is pretty easy for you to define your own custom policy and write a command (similar to the ones in Nsync) to use it.
- Some examples of alternative policies might be:
- Run deletes before creates and updates
- Search and execute certain actions before all others