A Python Program to Help You Back Up Your Files Automatically


Lost or deleted files are a common phenomenon. To ensure file and folder security, prudence dictates you take a backup. To avoid the drudgery of physically doing so every now and then, it is best to automate the process. The author has created a Python program for just such a situation.

We often create new documents, files and folders on our computers, and sometimes we delete them by mistake. Suppose you have spent a lot of time preparing an important presentation, and then your hard disk crashes. It is a painful situation. People often take backups of their files, but doing so by hand every hour or every minute is impractical. To overcome this problem, I am going to demonstrate a Python script I’ve created that keeps backing up your file or folder at an interval you specify. The program is called sync.py. It is for Windows and is compatible with Python 2 and Python 3.

The program contains the following three files:

sync.py: The main program

Sync1.ini: The configuration file

logger1.py: The module for logger support

Sync.log is the log file that sync.py creates when it runs.

Let us now understand the code of sync.py and look at how it works.

First, import the required modules:

    import configparser
    import time
    import shutil
    import hashlib
    # distutils is unavailable from Python 3.12 onwards
    from distutils.dir_util import copy_tree
    from collections import OrderedDict
    import os
    import logger1 as log1

The following code reads the Sync1.ini file:

    def ConfRead():
        config = configparser.ConfigParser()
        config.read("Sync1.ini")
        return dict(config.items("Para"))

Shown below are some of the variables obtained from the Sync1.ini file. Freq and Total_time are given in minutes and converted to seconds here:

    All_Config = ConfRead()
    Freq = int(All_Config.get("freq")) * 60
    Total_time = int(All_Config.get("total_time")) * 60
    repeat = int(Total_time / Freq)
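As a rough illustration of how these values come out, the sketch below writes a throwaway copy of the sample settings to a temporary file and reads it back. The temporary file and the helper name read_para are assumptions made for this example only. Note that configparser lower-cases option names, which is why the script looks up "freq" and "total_time".

```python
import configparser
import os
import tempfile

# Sample settings matching the article's Sync1.ini
SAMPLE = "[Para]\nFrom = K:\\testing1\nTo = E:\\\nFreq = 1\nTotal_time = 5\n"


def read_para(path):
    """Return the [Para] section as a plain dict (hypothetical helper)."""
    config = configparser.ConfigParser()
    config.read(path)
    return dict(config.items("Para"))


with tempfile.TemporaryDirectory() as tmp:
    ini_path = os.path.join(tmp, "Sync1.ini")
    with open(ini_path, "w") as f:
        f.write(SAMPLE)
    settings = read_para(ini_path)

freq = int(settings["freq"]) * 60              # seconds between backups
total_time = int(settings["total_time"]) * 60  # total run time in seconds
repeat = total_time // freq                    # number of backup passes
print(settings["from"], settings["to"], freq, total_time, repeat)
```

With Freq = 1 and Total_time = 5, the script therefore makes five passes, one minute apart.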
Figure 1: Making an exe file using pyinstaller
Figure 2: Place the exe file in the Windows folder

The following function, md5, calculates the hash of a file. If you modify a file, its name stays the same but its hash changes.

    def md5(fname, size=4096):
        hash_md5 = hashlib.md5()
        with open(fname, "rb") as f:
            # Read the file in chunks so large files do not fill memory
            for chunk in iter(lambda: f.read(size), b""):
                hash_md5.update(chunk)
        return hash_md5.hexdigest()
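To see the effect described above, here is a small self-contained check. It repeats the md5 function and hashes a temporary file before and after its content changes. The file name report.txt and its contents are made up for the example.

```python
import hashlib
import os
import tempfile


def md5(fname, size=4096):
    """Hash a file in fixed-size chunks, as in the article."""
    hash_md5 = hashlib.md5()
    with open(fname, "rb") as f:
        for chunk in iter(lambda: f.read(size), b""):
            hash_md5.update(chunk)
    return hash_md5.hexdigest()


with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "report.txt")  # hypothetical file
    with open(path, "wb") as f:
        f.write(b"first draft")
    before = md5(path)
    with open(path, "wb") as f:
        f.write(b"second draft")            # same name, new content
    after = md5(path)

print(before != after)
```

The name is unchanged but the two digests differ, which is how the program can tell that a file has been modified.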

The following function copies a whole directory, including all its subdirectories:

    def CopyDir(from1, to):
        copy_tree(from1, to)

The following function just copies one file to the destination location:

    def CopyFiles(file, path_to_copy):
        shutil.copy(file, path_to_copy)
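Because distutils is not shipped with Python 3.12 and later, readers on newer interpreters may need a substitute for copy_tree. One possible option, sketched here rather than taken from the original program, is shutil.copytree with dirs_exist_ok=True (available from Python 3.8). The function name copy_dir and the folder names are hypothetical.

```python
import os
import shutil
import tempfile


def copy_dir(src, dst):
    """Copy src into dst recursively (hypothetical stand-in for CopyDir)."""
    # dirs_exist_ok=True lets the copy proceed when dst already exists
    shutil.copytree(src, dst, dirs_exist_ok=True)


with tempfile.TemporaryDirectory() as tmp:
    # Build a small source tree: testing1/sub/a.txt
    src = os.path.join(tmp, "testing1")
    os.makedirs(os.path.join(src, "sub"))
    with open(os.path.join(src, "sub", "a.txt"), "w") as f:
        f.write("hello")

    dst = os.path.join(tmp, "backup", "testing1")
    copy_dir(src, dst)
    copied = os.path.isfile(os.path.join(dst, "sub", "a.txt"))

print(copied)
```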
Figure 3: CMD default path
Figure 4: Sync command

The following function walks the source location and builds a dictionary of all the files found there. Each key is a (hash, relative path) pair and each value is the full path of the file:

    def OriginalFiles():
        drive = All_Config.get("from")
        Files_Dict = OrderedDict()
        print(drive)
        for root, dirs, files in os.walk(drive, topdown=True):
            for file in files:
                file = file.lower()
                file_path = root + '\\' + file
                try:
                    hash1 = md5(file_path, size=4096)
                    # modification_time = int(os.path.getmtime(file_path))
                    # str.strip() removes characters, not a prefix,
                    # so the relative path is taken with os.path.relpath
                    rel_path = os.path.relpath(file_path, drive)
                    Files_Dict[(hash1, rel_path)] = file_path
                except Exception as e:
                    log1.logger.error('Error Original files: {0}'.format(e))
        return Files_Dict
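The idea of keying each file by its hash and its path relative to the root can be tried out on any operating system with the sketch below. It is not the article's function: the name snapshot is hypothetical, paths are joined with os.path.join instead of a hard-coded backslash, and each file is hashed in a single read for brevity.

```python
import hashlib
import os
import tempfile
from collections import OrderedDict


def snapshot(root_dir):
    """Map (md5 hex digest, path relative to root_dir) -> full path."""
    files = OrderedDict()
    for root, dirs, names in os.walk(root_dir):
        for name in names:
            full = os.path.join(root, name)
            with open(full, "rb") as f:
                digest = hashlib.md5(f.read()).hexdigest()
            files[(digest, os.path.relpath(full, root_dir))] = full
    return files


with tempfile.TemporaryDirectory() as tmp:
    os.makedirs(os.path.join(tmp, "docs"))
    with open(os.path.join(tmp, "docs", "notes.txt"), "w") as f:
        f.write("hello")
    snap = snapshot(tmp)
    for (digest, rel), full in snap.items():
        print(rel, digest)
```

Because the key uses the relative path, the same file under the source root and under the backup root produces the same key, which is what makes the two dictionaries comparable.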

The following function builds the same kind of dictionary for the destination location. If the backed-up root folder is not yet present there, it calls the CopyDir function to copy the whole folder and returns None.

    def Destination():
        Files_Dict = OrderedDict()
        from1 = All_Config.get("from")
        to = All_Config.get("to")
        dir1 = from1.rsplit("\\", 1)[1]
        drive = to + dir1
        # print(drive)
        try:
            if os.path.isdir(drive):
                for root, dirs, files in os.walk(drive, topdown=True):
                    for file in files:
                        file = file.lower()
                        file_path = root + '\\' + file
                        try:
                            hash1 = md5(file_path, size=4096)
                            # modification_time = int(os.path.getmtime(file_path))
                            # str.strip() removes characters, not a prefix,
                            # so the relative path is taken with os.path.relpath
                            rel_path = os.path.relpath(file_path, drive)
                            Files_Dict[(hash1, rel_path)] = file_path
                        except Exception as e:
                            log1.logger.error('Error Destination for loop: {0}'.format(e))
                return Files_Dict
            else:
                # First run: the folder is missing, so copy it in full
                CopyDir(from1, drive)
                log1.logger.info('Full folder: {0} copied'.format(from1))
                return None
        except Exception as e:
            log1.logger.error('Error Destination: {0}'.format(e))

The following function handles two cases:

  • A file has been created, possibly inside a new folder.
  • A file has been modified.
Figure 5: Full folder is copied
Figure 6: After modifying the file

In both cases, the following piece of code simply compares the source and destination dictionaries. If any file has been created or modified, the program copies it from the source to the destination.

    def LogicCompare():
        from1 = All_Config.get("from")
        to = All_Config.get("to")
        Dest_dict = Destination()
        # None means the folder was just copied in full or an error was logged;
        # an empty dictionary (empty backup folder) must still be compared
        if Dest_dict is not None:
            Source_dict = OriginalFiles()
            remaining_files = set(Source_dict.keys()) - set(Dest_dict.keys())
            remaining_files = [Source_dict.get(k) for k in remaining_files]
            for file_path in remaining_files:
                try:
                    log1.logger.info('File: {0}'.format(file_path))
                    dir_path, file = file_path.rsplit("\\", 1)
                    rel_dir = from1.rsplit("\\", 1)[1]
                    rel_dir1 = dir_path.replace(from1, "")
                    dest_dir = to + rel_dir + "\\" + rel_dir1
                    if not os.path.isdir(dest_dir):
                        os.makedirs(dest_dir)
                    CopyFiles(file_path, dest_dir)
                except Exception as e:
                    log1.logger.error('Error LogicCompare: {0}'.format(e))
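The comparison itself is just a set difference on the dictionary keys. The toy data below, with invented hashes and file names, shows which entries it selects: a file whose hash differs (modified) and a file whose key is missing from the destination (newly created).

```python
# Keys are (hash, relative path); values are full source paths.
# All hashes and paths here are invented for illustration.
source = {
    ("aaa111", "report.txt"): "K:\\testing1\\report.txt",
    ("bbb222", "notes.txt"): "K:\\testing1\\notes.txt",          # modified
    ("ccc333", "new\\todo.txt"): "K:\\testing1\\new\\todo.txt",  # new file
}
destination = {
    ("aaa111", "report.txt"): "E:\\testing1\\report.txt",
    ("old000", "notes.txt"): "E:\\testing1\\notes.txt",
}

# Keys present in the source but not in the destination need copying
to_copy = sorted(source[k] for k in set(source) - set(destination))
print(to_copy)
```

Keys that exist only in the destination are ignored, so this comparison never deletes anything from the backup.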

The following piece of code uses a loop to run the comparison repeatedly, waiting Freq seconds between passes:

    i = 0
    while True:
        if i >= repeat:
            break
        LogicCompare()
        time.sleep(Freq)
        i = i + 1
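The same schedule can be written as a bounded for loop. The sketch below is an alternative formulation, not the article's code: the function name run_periodically is hypothetical, and the sleep function is passed in as a parameter so the schedule can be checked without actually waiting.

```python
import time


def run_periodically(task, interval, repeats, sleep=time.sleep):
    """Call task() `repeats` times, pausing `interval` seconds after each call."""
    for _ in range(repeats):
        task()
        sleep(interval)


calls = []
waits = []
# Record the requested pauses instead of sleeping
run_periodically(lambda: calls.append("backup"), interval=60, repeats=5,
                 sleep=waits.append)
print(len(calls), sum(waits))
```

In the real program, task would be LogicCompare and the default time.sleep would be used.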

Let us look at the contents of the Sync1.ini file:

[Para]

From = K:\testing1

To = E:\

Freq = 1

Total_time = 5

In the above file:

From: The source folder to back up (here, the testing1 folder).

To: The destination where the backup is stored.

Freq: The interval between backups, in minutes.

Total_time: The total time, in minutes, for which the program keeps running.

Let us look at the code of logger1.py:

    import logging

    logger = logging.getLogger("Mohit")
    logger.setLevel(logging.INFO)
    fh = logging.FileHandler("Sync.log")
    formatter = logging.Formatter('%(asctime)s - %(levelname)s - %(message)s')
    fh.setFormatter(formatter)
    logger.addHandler(fh)

The above code is very simple: it writes messages of level INFO and above to Sync.log, each prefixed with a timestamp and the level name.
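To illustrate what ends up in the log, the sketch below sets up a logger the same way but writes to a temporary directory. The logger name sync_demo and the messages are made up for the example.

```python
import logging
import os
import tempfile

# Write the log to a temporary directory (assumption for this demo)
tmp_dir = tempfile.mkdtemp()
log_path = os.path.join(tmp_dir, "Sync.log")

logger = logging.getLogger("sync_demo")  # hypothetical logger name
logger.setLevel(logging.INFO)
handler = logging.FileHandler(log_path)
handler.setFormatter(
    logging.Formatter("%(asctime)s - %(levelname)s - %(message)s"))
logger.addHandler(handler)

logger.debug("not written: DEBUG is below INFO")
logger.info("File: K:\\testing1\\notes.txt")
logger.error("Error Destination: drive not found")

# Flush and release the file before reading it back
handler.close()
logger.removeHandler(handler)

with open(log_path) as f:
    lines = f.read().splitlines()
print(len(lines))
```

Only the INFO and ERROR messages reach the file. The DEBUG message is filtered out by the logger's level.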

If you don’t want to bother running the code using the interpreter, make a Windows exe file and this will work as a command.

In order to convert this, let us take the help of pyinstaller. I have already installed that module.

The command in Figure 1 converts your code into an exe file. To run it, the Python interpreter is not needed.

Figure 7: After creating the new file
Figure 8: Output when the pen-drive is not present

How to run the program

After executing the command as shown in Figure 1, check the folder named Sync. In this folder, check the folder named dist, where you will get the .exe file. Now copy this .exe file and paste it into the C:/Windows folder, as shown in Figure 2.

Now open the command prompt. Check the current folder as shown in Figure 3. On my PC, the default prompt path is c:/user/Mohit. On your PC, it may be different. So copy the Sync1.ini file and paste it in the c:/user/<your-name> folder.

Now plug in an external pen-drive. Check the pen-drive’s drive letter, which on my PC is E.

Based on your PC configuration, change the parameter of Sync1.ini placed in the C:/user/<your-name> directory.

Now open the command prompt and type the command as shown in Figure 4.

Now check your pen-drive. Look at sync.log, which was created in the folder c:/user/<your-name>

Four cases are possible:

  1. When the whole folder is not present in the pen-drive (Figure 5).
  2. When you modify the existing file (Figure 6).
  3. When you create a new file (Figure 7).
  4. The last case is a negative test case when the pen-drive is not present (Figure 8).
