How to avoid repeated file content when using zipfile in Python

I need to compress multiple XML files, and I achieved this with lxml, zipfile and a for loop.

My problem is that every time I rerun my function, the content of the compressed files is repeated (appended at the end), so the files keep getting longer. I believe this has to do with the a+b write mode. I thought that by using with open, the files would be deleted at the end of the code block and no more content would be added to them. I was wrong, and with the other modes I do not get the intended result.

Here is my code:

def compress_package_file(self):
   bytes_buffer = BytesIO()
   with zipfile.ZipFile(bytes_buffer, 'w') as invoices_package:
       i = 1
       for invoice in record.invoice_ids.sorted('sin_number'):
           invoice_file_name = 'Invoice_' + invoice.number + '.xml'
           with open(invoice_file_name, 'a+b') as invoice_file:
               invoice_file.write(invoice._get_invoice_xml().getvalue())
               invoices_package.write(invoice_file_name, compress_type=zipfile.ZIP_DEFLATED)
           i += 1
   compressed_package = bytes_buffer.getvalue()
   encoded_compressed_file = base64.b64encode(compressed_package)

My XML generator is in another function and works fine, but the content repeats each time I run this function. For example, if I run it twice, the content of the files in the compressed file looks something like this (simplified content):

<?xml version='1.0' encoding='UTF-8'?>
<invoice xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:noNamespaceSchemaLocation="invoice.xsd">
    <header>
        <invoiceNumber>9</invoiceNumber>
    </header>
</invoice><?xml version='1.0' encoding='UTF-8'?>
<invoice xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:noNamespaceSchemaLocation="invoice.xsd">
    <header>
        <invoiceNumber>9</invoiceNumber>
    </header>
</invoice>

If I use w+b mode, the files in the archive are empty.
What should my code look like to avoid this behavior?

Solution:

I suggest you do use w+b mode, but move the write to the zip file so that it happens after the invoice XML file has been closed.

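From what you wrote, it looks as if you are trying to compress a file whose contents have not yet been flushed to disk, so with w+b it is still empty at the time of compression. With a+b the file is never truncated, so content written by earlier runs is still on disk and ends up in the archive.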
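Both symptoms from the question can be reproduced without zipfile at all. The snippet below is a minimal sketch: payload is a hypothetical stand-in for invoice._get_invoice_xml().getvalue(), and the file lives in a temporary directory rather than wherever the original function writes. It relies on Python's default buffering, under which a small write normally stays in memory until the file is flushed or closed.

```python
import os
import tempfile

# Hypothetical payload standing in for invoice._get_invoice_xml().getvalue().
payload = b"<invoice><header><invoiceNumber>9</invoiceNumber></header></invoice>"

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "Invoice_9.xml")

    # Symptom 1: 'a+b' keeps whatever an earlier run left in the file.
    for _ in range(2):  # simulate running the function twice
        with open(path, "a+b") as invoice_file:
            invoice_file.write(payload)
    size_after_two_appends = os.path.getsize(path)

    # Symptom 2: 'w+b' truncates the file on open, and a small write
    # normally sits in Python's buffer until the file is flushed or closed.
    with open(path, "w+b") as invoice_file:
        invoice_file.write(payload)
        size_while_open = os.path.getsize(path)  # measured before the flush
    size_after_close = os.path.getsize(path)     # measured after the close

print("payload size:", len(payload))
print("after two a+b runs:", size_after_two_appends)
print("w+b while open:", size_while_open, "after close:", size_after_close)
```

Anything that reads the file by name while it is still open, as invoices_package.write does in the question, sees the on-disk size and not the buffered data.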

So try removing one level of indentation from the invoices_package.write line, so that it runs only after the with open block has closed the file (I can't format code properly on mobile, so I can't post the whole section).
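For reference, the suggested change could look roughly like the sketch below. It is not the asker's actual code: invoices (a list of (number, xml_bytes) pairs) and work_dir are hypothetical stand-ins for record.invoice_ids, invoice._get_invoice_xml().getvalue() and the directory the function writes to, so that the example runs on its own. The second function is an alternative that is not part of the answer above: it skips the intermediate files and hands the bytes straight to ZipFile.writestr.

```python
import base64
import os
import tempfile
import zipfile
from io import BytesIO


def compress_invoices(invoices, work_dir):
    """Zip (number, xml_bytes) pairs via files on disk; return base64 bytes.

    `invoices` and `work_dir` are hypothetical stand-ins for the Odoo
    recordset and working directory used in the question.
    """
    bytes_buffer = BytesIO()
    with zipfile.ZipFile(bytes_buffer, "w") as invoices_package:
        for number, xml_bytes in invoices:
            invoice_file_name = "Invoice_" + number + ".xml"
            invoice_path = os.path.join(work_dir, invoice_file_name)
            # 'w+b' truncates on open, so nothing from an earlier run survives.
            with open(invoice_path, "w+b") as invoice_file:
                invoice_file.write(xml_bytes)
            # Dedented one level: the file is closed, and therefore flushed,
            # before zipfile reads it from disk.
            invoices_package.write(
                invoice_path,
                arcname=invoice_file_name,
                compress_type=zipfile.ZIP_DEFLATED,
            )
    return base64.b64encode(bytes_buffer.getvalue())


def compress_invoices_in_memory(invoices):
    """Alternative without intermediate files, using ZipFile.writestr."""
    bytes_buffer = BytesIO()
    with zipfile.ZipFile(bytes_buffer, "w", zipfile.ZIP_DEFLATED) as invoices_package:
        for number, xml_bytes in invoices:
            invoices_package.writestr("Invoice_" + number + ".xml", xml_bytes)
    return base64.b64encode(bytes_buffer.getvalue())


# Run the on-disk version twice to mimic rerunning the function.
sample_invoices = [("9", b"<invoice><invoiceNumber>9</invoiceNumber></invoice>")]
with tempfile.TemporaryDirectory() as demo_dir:
    compress_invoices(sample_invoices, demo_dir)
    encoded_second_run = compress_invoices(sample_invoices, demo_dir)

with zipfile.ZipFile(BytesIO(base64.b64decode(encoded_second_run))) as package:
    print(package.namelist())
    print(package.read("Invoice_9.xml"))
```

If the XML files are not needed on disk for any other purpose, the writestr variant avoids both the leftover-file and the flushing problem.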
