Please send any questions and the script when ready to: mario.farina@kantarmedia.com
You can also publish the script to your GitHub account if you have one.
The task is to write a script that reads the two files Data file 1.csv and Data file 2.csv and combines them into a new file. The output should match Data file full.csv, which is provided only as an example of the expected result.
Both files are CSVs (comma separated) with two fields:
- Serial ID
- A variable
The output of the script should have the following fields:
- Serial ID
- Variable from Data file 1
- Variable from Data file 2
In effect, your program does the same job as an Excel lookup: it matches the Serial IDs in both files and adds the information from Data file 2 to Data file 1.
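The lookup idea can be sketched on in-memory data before any files are involved. This is only an illustration, not the expected solution: the function name lookup_merge and the sample values are made up, and leaving an unmatched Serial ID blank is an assumption (check Data file full.csv to see what the real output does).

```python
def lookup_merge(rows_1, rows_2):
    """Join two lists of (serial_id, value) pairs on serial_id."""
    # Build the lookup table from the second data set: serial_id -> value.
    lookup = {serial_id: value for serial_id, value in rows_2}
    merged = []
    for serial_id, value_1 in rows_1:
        # Assumption: an ID missing from the second data set gets an empty string.
        merged.append((serial_id, value_1, lookup.get(serial_id, "")))
    return merged


# Made-up sample data, not taken from the real files.
sample_1 = [("1001", "A"), ("1002", "B"), ("1003", "C")]
sample_2 = [("1003", "x"), ("1001", "y")]
merged_sample = lookup_merge(sample_1, sample_2)
```

The dictionary plays the role of the Excel lookup range: each Serial ID from the first data set is looked up once, regardless of the order of the rows in the second.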
In order to complete the task you need to know how to:
Read and write files
- http://www.afterhoursprogramming.com/tutorial/Python/Reading-Files/
- http://www.pythonforbeginners.com/files/reading-and-writing-files-in-python
Deal with substrings
- http://www.tutorialspoint.com/python/python_strings.htm
- http://stackoverflow.com/questions/663171/is-there-a-way-to-substring-a-string-in-python
Loop through every line in the files
Split comma delimited fields in strings
Use dictionaries
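The skills listed above (reading files, looping through lines, splitting on commas, filling a dictionary, writing files) might fit together roughly as follows. This is a minimal sketch, not the contents of Script.py: the helper names read_pairs and write_rows are hypothetical, and it assumes every non-blank line contains exactly one Serial ID followed by a comma and one value, with no quoted commas.

```python
def read_pairs(path):
    """Read a two-field comma-separated file into a dict: first field -> second field."""
    pairs = {}
    with open(path, "r") as handle:
        for line in handle:
            line = line.strip()  # drop the trailing newline and stray spaces
            if not line:
                continue  # skip blank lines
            # Split at the first comma only; assumes each line has one.
            serial_id, value = line.split(",", 1)
            pairs[serial_id] = value
    return pairs


def write_rows(path, rows):
    """Write each row (a sequence of strings) as one comma-separated line."""
    with open(path, "w") as handle:
        for row in rows:
            handle.write(",".join(row) + "\n")
```

A plain split is enough for simple data like this. If the real files turn out to contain quoted fields, Python's standard csv module handles those cases.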
There is also a Script.py file with some pseudocode to get you started.
Install the Anaconda distribution. I recommend the Python 3.6 version.
Once the installation is complete, you can launch Spyder and start scripting! This tutorial can help.
Feel free to complete the exercise in C# or Java if you prefer.