Losing or accidentally deleting files is a common occurrence. To keep files and folders safe, it is prudent to take backups, and to avoid the drudgery of doing so by hand every now and then, it is best to automate the process. The author has created a Python program that backs up files automatically.
We often create new documents, files and folders on our computers, and sometimes we delete them by mistake. Suppose you have spent a lot of time on an important presentation and your hard disk then crashes. It is a painful situation. People often take backups of their files, but it is impractical to do so by hand every hour, let alone every minute. To overcome this problem, I am going to demonstrate a Python script I've created, which keeps taking backups of your file or folder at an interval that you specify. The name of the program is sync.py. It is for Windows and is compatible with Python 2 and Python 3.
The program consists of the following three files:
sync.py: The main program
Sync1.ini: The configuration file
logger1.py: The module for logger support
In addition, sync.py creates a log file named Sync.log.
Let us now understand the code of sync.py and look at how it works.
First, import the required libraries:

    import configparser
    import time
    import shutil
    import hashlib
    from distutils.dir_util import copy_tree
    from collections import OrderedDict
    import os
    import logger1 as log1
The following code reads the Sync1.ini file:
    def ConfRead():
        config = configparser.ConfigParser()
        config.read("Sync1.ini")
        return dict(config.items("Para"))
Shown below are some of the variables derived from the Sync1.ini file. Freq and Total_time are given in minutes in the file and are converted to seconds here:

    All_Config = ConfRead()
    Freq = int(All_Config.get("freq")) * 60
    Total_time = int(All_Config.get("total_time")) * 60
    repeat = int(Total_time / Freq)
The following function, md5, calculates the hash of a file. If you modify a file, its name remains the same but its hash changes.
    def md5(fname, size=4096):
        hash_md5 = hashlib.md5()
        # Read the file in chunks so that large files do not fill memory.
        with open(fname, "rb") as f:
            for chunk in iter(lambda: f.read(size), b""):
                hash_md5.update(chunk)
        return hash_md5.hexdigest()
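To see why the hash is a useful change detector, here is a minimal sketch that hashes a throwaway file, rewrites its content under the same name, and hashes it again. The file name report.txt and the temporary directory are assumptions made only for this demonstration.

```python
import hashlib
import os
import tempfile


def md5(fname, size=4096):
    """Return the MD5 hex digest of a file, read in chunks of `size` bytes."""
    hash_md5 = hashlib.md5()
    with open(fname, "rb") as f:
        for chunk in iter(lambda: f.read(size), b""):
            hash_md5.update(chunk)
    return hash_md5.hexdigest()


# Create a throwaway file (hypothetical name) in a temporary directory.
tmp_dir = tempfile.mkdtemp()
demo_path = os.path.join(tmp_dir, "report.txt")

with open(demo_path, "wb") as f:
    f.write(b"first draft")
hash_before = md5(demo_path)

# Modify the content; the file name stays the same.
with open(demo_path, "wb") as f:
    f.write(b"second draft")
hash_after = md5(demo_path)

print(hash_before != hash_after)  # True: same name, different hash
```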
The following function copies a whole directory, including all of its subdirectories:
    def CopyDir(from1, to):
        copy_tree(from1, to)
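One caveat for readers on recent interpreters: the distutils package was deprecated in Python 3.10 and removed in Python 3.12, so copy_tree is not available there. A possible substitute, shown here only as a sketch and not as the author's method, is shutil.copytree with dirs_exist_ok=True (Python 3.8 and later). The function name copy_dir and the folder names are hypothetical.

```python
import os
import shutil
import tempfile


def copy_dir(from1, to):
    """Copy a directory tree; merge into `to` if it already exists (Python 3.8+)."""
    shutil.copytree(from1, to, dirs_exist_ok=True)


# Build a small source tree with a nested folder (hypothetical names).
base = tempfile.mkdtemp()
src = os.path.join(base, "testing1")
dst = os.path.join(base, "backup", "testing1")
os.makedirs(os.path.join(src, "sub"))
with open(os.path.join(src, "sub", "a.txt"), "w") as f:
    f.write("hello")

copy_dir(src, dst)
copied = os.path.join(dst, "sub", "a.txt")
print(os.path.isfile(copied))
```

This substitute does not work on Python 2, so it only suits readers who run the script on Python 3.8 or later.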
The following function just copies one file to the destination location:
    def CopyFiles(file, path_to_copy):
        shutil.copy(file, path_to_copy)
The following function walks the source location and builds a dictionary of all the files present. Each key is a pair made of the file's hash and its path relative to the source folder, and each value is the full file path:
    def OriginalFiles():
        drive = All_Config.get("from")
        Files_Dict = OrderedDict()
        print(drive)
        for root, dir, files in os.walk(drive, topdown=True):
            for file in files:
                file = file.lower()
                file_path = root + '\\' + file
                try:
                    hash1 = md5(file_path, size=4096)
                    # modification_time = int(os.path.getmtime(file_path))
                    # Slice off the source prefix; str.strip() would remove
                    # a set of characters rather than the prefix.
                    rel_path = file_path[len(drive):]
                    Files_Dict[(hash1, rel_path)] = file_path
                except Exception as e:
                    log1.logger.error('Error Original files: {0}'.format(e))
        return Files_Dict

The following function builds the same kind of dictionary for the destination location. If the backup folder is not yet present at the destination, it calls the CopyDir function to copy the whole source folder and returns None.

    def Destination():
        Files_Dict = OrderedDict()
        from1 = All_Config.get("from")
        to = All_Config.get("to")
        dir1 = from1.rsplit("\\", 1)[1]
        drive = to + dir1
        # print(drive)
        try:
            if os.path.isdir(drive):
                for root, dir, files in os.walk(drive, topdown=True):
                    for file in files:
                        file = file.lower()
                        file_path = root + '\\' + file
                        try:
                            hash1 = md5(file_path, size=4096)
                            # modification_time = int(os.path.getmtime(file_path))
                            # Same prefix slicing as in OriginalFiles, so keys match.
                            rel_path = file_path[len(drive):]
                            Files_Dict[(hash1, rel_path)] = file_path
                        except Exception as e:
                            log1.logger.error('Error Destination for loop: {0}'.format(e))
                return Files_Dict
            else:
                CopyDir(from1, drive)
                log1.logger.info('Full folder: {0} copied'.format(from1))
                return None
        except Exception as e:
            log1.logger.error('Error Destination: {0}'.format(e))
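Both dictionaries identify a file by its hash and its path relative to the backup root, so how that relative path is derived matters. As a side note, str.strip treats its argument as a set of characters to trim from both ends, not as a prefix, so slicing by the prefix length is the safer way to drop the root folder. The sketch below contrasts the two on a hypothetical path.

```python
drive = "K:\\testing1"
file_path = "K:\\testing1\\notes.txt"

# str.strip treats its argument as a set of characters to remove from both ends.
stripped = file_path.strip(drive)

# Slicing by the prefix length removes exactly the source folder.
sliced = file_path[len(drive):]

print(stripped)  # otes.tx
print(sliced)    # \notes.txt
```

With strip, the leading "n" and trailing "t" of the file name are lost because both characters also occur in the folder name.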
The following function defines the backup logic. It handles two cases:
- A new file has been created, possibly inside a new folder.
- An existing file has been modified.
In both cases, the code simply compares the source and destination dictionaries. If any file has been created or modified, the script copies it from the source to the destination.
    def LogicCompare():
        from1 = All_Config.get("from")
        to = All_Config.get("to")
        Dest_dict = Destination()
        if Dest_dict:
            Source_dict = OriginalFiles()
            # Keys found only in the source are new or modified files.
            remaining_files = set(Source_dict.keys()) - set(Dest_dict.keys())
            remaining_files = [Source_dict.get(k) for k in remaining_files]
            for file_path in remaining_files:
                try:
                    log1.logger.info('File: {0}'.format(file_path))
                    dir, file = file_path.rsplit("\\", 1)
                    rel_dir = from1.rsplit("\\", 1)[1]
                    rel_dir1 = dir.replace(from1, "")
                    dest_dir = to + rel_dir + "\\" + rel_dir1
                    if not os.path.isdir(dest_dir):
                        os.makedirs(dest_dir)
                    CopyFiles(file_path, dest_dir)
                except Exception as e:
                    log1.logger.error('Error LogicCompare: {0}'.format(e))
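The comparison step can be illustrated in isolation. The sketch below uses two small hand-made dictionaries in place of real directory scans; the hash strings and paths are made up for the example.

```python
from collections import OrderedDict

# Hypothetical snapshots: keys are (hash, relative path), values are full paths.
source_dict = OrderedDict([
    (("aaa111", "\\notes.txt"), "K:\\testing1\\notes.txt"),          # unchanged
    (("bbb222", "\\report.txt"), "K:\\testing1\\report.txt"),        # modified
    (("ccc333", "\\new\\todo.txt"), "K:\\testing1\\new\\todo.txt"),  # new
])
dest_dict = OrderedDict([
    (("aaa111", "\\notes.txt"), "E:\\testing1\\notes.txt"),
    (("old999", "\\report.txt"), "E:\\testing1\\report.txt"),
])

# Keys present in the source but not in the destination need copying.
remaining_keys = set(source_dict) - set(dest_dict)
to_copy = sorted(source_dict[k] for k in remaining_keys)
print(to_copy)
```

Because only the source-minus-destination difference is taken, a file deleted from the source is left untouched in the backup.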
The following piece of code uses a loop to run the comparison repeatedly, once every Freq seconds, until it has run repeat times:

    i = 0
    while True:
        if i >= repeat:
            break
        LogicCompare()
        time.sleep(Freq)
        i = i + 1

Let us look at the content of the Sync1.ini file:

    [Para]
    From = K:\testing1
    To = E:\
    Freq = 1
    Total_time = 5
In the above configuration:
From: Specifies the source folder to back up; here, the testing1 folder.
To: Specifies where to store the backup.
Freq: Specifies the interval, in minutes, between backups.
Total_time: Specifies the total time, in minutes, for which the program runs.
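The way these settings turn into a loop count can be sketched as follows. The example is Python 3 only, because it loads the sample configuration from an inline string with read_string instead of reading the file; the lowercase variable names are chosen for the sketch. Note that configparser lowercases option names by default, which is why the script can look up "freq" although the file says Freq.

```python
import configparser

# The sample configuration from the article, inlined for the demo.
SAMPLE_INI = """
[Para]
From = K:\\testing1
To = E:\\
Freq = 1
Total_time = 5
"""

config = configparser.ConfigParser()
config.read_string(SAMPLE_INI)
all_config = dict(config.items("Para"))

# Option names come back lowercased: from, to, freq, total_time.
freq_seconds = int(all_config["freq"]) * 60
total_seconds = int(all_config["total_time"]) * 60
repeat = total_seconds // freq_seconds

print(sorted(all_config))
print(freq_seconds, total_seconds, repeat)  # 60 300 5
```

With these values, the program compares the folders five times, one minute apart.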
Let us look at the code of logger1.py:
    import logging

    logger = logging.getLogger("Mohit")
    logger.setLevel(logging.INFO)
    fh = logging.FileHandler("Sync.log")
    formatter = logging.Formatter('%(asctime)s - %(levelname)s - %(message)s')
    fh.setFormatter(formatter)
    logger.addHandler(fh)
The above code is simple: it writes messages of level INFO and above to the Sync.log file, each prefixed with a timestamp and the level name.
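To see what the INFO level and the format string do in practice, the following sketch uses the same formatter but writes to an in-memory stream instead of Sync.log. The logger name sync_demo and the messages are assumptions made for the example.

```python
import io
import logging

stream = io.StringIO()
demo_logger = logging.getLogger("sync_demo")  # hypothetical logger name
demo_logger.setLevel(logging.INFO)
demo_logger.propagate = False  # keep the demo output out of the root logger

handler = logging.StreamHandler(stream)
handler.setFormatter(logging.Formatter("%(asctime)s - %(levelname)s - %(message)s"))
demo_logger.addHandler(handler)

demo_logger.debug("not recorded")  # below INFO, so it is dropped
demo_logger.info("File: K:\\testing1\\notes.txt")
demo_logger.error("Error LogicCompare: example")

log_lines = stream.getvalue().splitlines()
print(len(log_lines))  # 2
```

Only the INFO and ERROR messages reach the stream; the DEBUG message is filtered out by the logger's level.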
If you don't want to run the code through the interpreter each time, you can build a Windows .exe file, which can then be used as a command.
To do the conversion, let us take the help of PyInstaller. I have already installed that module.
The command in Figure 1 converts your code into an .exe file, which does not need the Python interpreter to run.
How to run the program
After executing the command as shown in Figure 1, check the folder named Sync. In this folder, check the folder named dist, where you will get the .exe file. Now copy this .exe file and paste it into the C:/Windows folder, as shown in Figure 2.
Now open the command prompt and check the current folder, as shown in Figure 3. On my PC, the default prompt path is c:/user/Mohit. On your PC, it may be different. Copy the Sync1.ini file and paste it into the c:/user/<your-name> folder.
Now plug in an external pen-drive and check the drive letter assigned to it, which on my PC is E.
Based on your PC configuration, change the parameters of the Sync1.ini file placed in the c:/user/<your-name> directory.
Now open the command prompt and type the command as shown in Figure 4.
Now check your pen-drive. Also look at Sync.log, which was created in the folder c:/user/<your-name>.
Four cases are possible:
- 1. When the whole folder is not present in the pen-drive (Figure 5).
- 2. When you modify the existing file (Figure 6).
- 3. When you create a new file (Figure 7).
- 4. The last case is a negative test case, when the pen-drive is not present (Figure 8).