pandas – Convert deep nested xml file to csv in python

I’m new to Python and I try to convert xml(including namespaces) file into csv. I can’t get my head around how to iterate through the tags and extract all the data. it’s a large xml file(350MB).

<ABR recordLastUpdatedDate="20210812" replaced="N"><ABN status="ACT" ABNStatusFromDate="20000303">11000016722</ABN><EntityType><EntityTypeInd>PUB</EntityTypeInd><EntityTypeText>Australian Public Company</EntityTypeText></EntityType><MainEntity><NonIndividualName type="MN"><NonIndividualNameText>INSURANCE AUSTRALIA LIMITED</NonIndividualNameText></NonIndividualName><BusinessAddress><AddressDetails><State>NSW</State><Postcode>2000</Postcode></AddressDetails></BusinessAddress></MainEntity><ASICNumber ASICNumberType="undetermined">000016722</ASICNumber><GST status="ACT" GSTStatusFromDate="20000701" /><OtherEntity><NonIndividualName type="TRD"><NonIndividualNameText>NRMA INSURANCE LTD</NonIndividualNameText></NonIndividualName></OtherEntity><OtherEntity><NonIndividualName type="OTN"><NonIndividualNameText>NRMA INSURANCE LTD-WORKERS COMPENSATION</NonIndividualNameText></NonIndividualName></OtherEntity><OtherEntity><NonIndividualName type="BN"><NonIndividualNameText>""Wesfarmers Federation Insurance""</NonIndividualNameText></NonIndividualName></OtherEntity><OtherEntity><NonIndividualName type="BN"><NonIndividualNameText>Aim Dealer Services</NonIndividualNameText></NonIndividualName></OtherEntity><OtherEntity><NonIndividualName type="BN"><NonIndividualNameText>Aim Insurance</NonIndividualNameText></NonIndividualName></OtherEntity><OtherEntity><NonIndividualName type="BN"><NonIndividualNameText>Airport Buddy</NonIndividualNameText></NonIndividualName></OtherEntity><OtherEntity><NonIndividualName type="BN"><NonIndividualNameText>Assureme</NonIndividualNameText></NonIndividualName></OtherEntity><OtherEntity><NonIndividualName type="BN"><NonIndividualNameText>Australian Union Assurance</NonIndividualNameText></NonIndividualName></OtherEntity><OtherEntity><NonIndividualName type="BN"><NonIndividualNameText>Bitsy Cover</NonIndividualNameText></NonIndividualName></OtherEntity><OtherEntity><NonIndividualName type="BN"><NonIndividualNameText>BuddyCover</NonIndividualNameText></NonIndividualName></OtherEntity><OtherEntity><NonIndividualName type="BN"><NonIndividualNameText>C G U Insurance</NonIndividualNameText></NonIndividualName></OtherEntity><OtherEntity><NonIndividualName type="BN"><NonIndividualNameText>C G U Professional Risks Insurance</NonIndividualNameText></NonIndividualName></OtherEntity><OtherEntity><NonIndividualName type="BN"><NonIndividualNameText>C G U Trade Credit Insurance</NonIndividualNameText></NonIndividualName></OtherEntity><OtherEntity><NonIndividualName type="BN"><NonIndividualNameText>C G U Travel Insurance</NonIndividualNameText></NonIndividualName></OtherEntity><OtherEntity><NonIndividualName type="BN"><NonIndividualNameText>CGU Advantage</NonIndividualNameText></NonIndividualName></OtherEntity><OtherEntity><NonIndividualName type="BN"><NonIndividualNameText>CGU Cyber Defence</NonIndividualNameText></NonIndividualName></OtherEntity><OtherEntity><NonIndividualName type="BN"><NonIndividualNameText>CGU Home Warranty Insurance</NonIndividualNameText></NonIndividualName></OtherEntity><OtherEntity><NonIndividualName type="BN"><NonIndividualNameText>CGU Insurance</NonIndividualNameText></NonIndividualName></OtherEntity><OtherEntity><NonIndividualName type="BN"><NonIndividualNameText>CGU Professional Risks Insurance</NonIndividualNameText></NonIndividualName></OtherEntity><OtherEntity><NonIndividualName type="BN"><NonIndividualNameText>CGU Trade Credit Insurance</NonIndividualNameText></NonIndividualName></OtherEntity><OtherEntity><NonIndividualName type="BN"><NonIndividualNameText>CGU Travel Insurance</NonIndividualNameText></NonIndividualName></OtherEntity><OtherEntity><NonIndividualName type="BN"><NonIndividualNameText>CGU Workers Compensation</NonIndividualNameText></NonIndividualName></OtherEntity><OtherEntity><NonIndividualName type="BN"><NonIndividualNameText>Dairypol</NonIndividualNameText></NonIndividualName></OtherEntity><OtherEntity><NonIndividualName type="BN"><NonIndividualNameText>Grain Farmers Crop Insurance Agency</NonIndividualNameText></NonIndividualName></OtherEntity><OtherEntity><NonIndividualName type="BN"><NonIndividualNameText>Grainpol</NonIndividualNameText></NonIndividualName></OtherEntity><OtherEntity><NonIndividualName type="BN"><NonIndividualNameText>IAG Insurance</NonIndividualNameText></NonIndividualName></OtherEntity><OtherEntity><NonIndividualName type="BN"><NonIndividualNameText>IAG Re Australia</NonIndividualNameText></NonIndividualName></OtherEntity><OtherEntity><NonIndividualName type="BN"><NonIndividualNameText>IAL.com.au</NonIndividualNameText></NonIndividualName></OtherEntity><OtherEntity><NonIndividualName type="BN"><NonIndividualNameText>Insurance 4 That</NonIndividualNameText></NonIndividualName></OtherEntity><OtherEntity><NonIndividualName type="BN"><NonIndividualNameText>Insurance for That</NonIndividualNameText></NonIndividualName></OtherEntity><OtherEntity><NonIndividualName type="BN"><NonIndividualNameText>InsureLite</NonIndividualNameText></NonIndividualName></OtherEntity><OtherEntity><NonIndividualName type="BN"><NonIndividualNameText>Kwikinsure</NonIndividualNameText></NonIndividualName></OtherEntity><OtherEntity><NonIndividualName type="BN"><NonIndividualNameText>Lioncare</NonIndividualNameText></NonIndividualName></OtherEntity><OtherEntity><NonIndividualName type="BN"><NonIndividualNameText>Lumley Connect</NonIndividualNameText></NonIndividualName></OtherEntity><OtherEntity><NonIndividualName type="BN"><NonIndividualNameText>Lumley General</NonIndividualNameText></NonIndividualName></OtherEntity><OtherEntity><NonIndividualName type="BN"><NonIndividualNameText>Lumley General Insurance</NonIndividualNameText></NonIndividualName></OtherEntity><OtherEntity><NonIndividualName type="BN"><NonIndividualNameText>Lumley Insurance</NonIndividualNameText></NonIndividualName></OtherEntity><OtherEntity><NonIndividualName type="BN"><NonIndividualNameText>Lumley Marine &amp; Logistics</NonIndividualNameText></NonIndividualName></OtherEntity><OtherEntity><NonIndividualName type="BN"><NonIndividualNameText>Lumley Online</NonIndividualNameText></NonIndividualName></OtherEntity><OtherEntity><NonIndividualName type="BN"><NonIndividualNameText>Lumley Retail Warranty</NonIndividualNameText></NonIndividualName></OtherEntity><OtherEntity><NonIndividualName type="BN"><NonIndividualNameText>Lumley Special Vehicles</NonIndividualNameText></NonIndividualName></OtherEntity><OtherEntity><NonIndividualName type="BN"><NonIndividualNameText>Metrorider</NonIndividualNameText></NonIndividualName></OtherEntity><OtherEntity><NonIndividualName type="BN"><NonIndividualNameText>mobility.ai.</NonIndividualNameText></NonIndividualName></OtherEntity><OtherEntity><NonIndividualName type="BN"><NonIndividualNameText>MOTOR TRADE DIRECT</NonIndividualNameText></NonIndividualName></OtherEntity><OtherEntity><NonIndividualName type="BN"><NonIndividualNameText>Motorassist</NonIndividualNameText></NonIndividualName></OtherEntity><OtherEntity><NonIndividualName type="BN"><NonIndividualNameText>Mums &amp; Company</NonIndividualNameText></NonIndividualName></OtherEntity><OtherEntity><NonIndividualName type="BN"><NonIndividualNameText>My.Place@Lumley</NonIndividualNameText></NonIndividualName></OtherEntity><OtherEntity><NonIndividualName type="BN"><NonIndividualNameText>NRMA HOME SECURITY</NonIndividualNameText></NonIndividualName></OtherEntity><OtherEntity><NonIndividualName type="BN"><NonIndividualNameText>NRMA INSURANCE</NonIndividualNameText></NonIndividualName></OtherEntity><OtherEntity><NonIndividualName type="BN"><NonIndividualNameText>NRMA INSURANCE</NonIndividualNameText></NonIndividualName></OtherEntity><OtherEntity><NonIndividualName type="BN"><NonIndividualNameText>NRMA SECURITY</NonIndividualNameText></NonIndividualName></OtherEntity><OtherEntity><NonIndividualName type="BN"><NonIndividualNameText>NRMA SECURITY</NonIndividualNameText></NonIndividualName></OtherEntity><OtherEntity><NonIndividualName type="BN"><NonIndividualNameText>NZI Insurance</NonIndividualNameText></NonIndividualName></OtherEntity><OtherEntity><NonIndividualName type="BN"><NonIndividualNameText>Pacific Indemnity</NonIndividualNameText></NonIndividualName></OtherEntity><OtherEntity><NonIndividualName type="BN"><NonIndividualNameText>Poncho Insurance</NonIndividualNameText></NonIndividualName></OtherEntity><OtherEntity><NonIndividualName type="BN"><NonIndividualNameText>Pounce Pet Insurance</NonIndividualNameText></NonIndividualName></OtherEntity><OtherEntity><NonIndividualName type="BN"><NonIndividualNameText>Prestige Boat Insurance</NonIndividualNameText></NonIndividualName></OtherEntity><OtherEntity><NonIndividualName type="BN"><NonIndividualNameText>Prestige Car Insurance</NonIndividualNameText></NonIndividualName></OtherEntity><OtherEntity><NonIndividualName type="BN"><NonIndividualNameText>Prestige Home Insurance</NonIndividualNameText></NonIndividualName></OtherEntity><OtherEntity><NonIndividualName type="BN"><NonIndividualNameText>Quick Insure</NonIndividualNameText></NonIndividualName></OtherEntity><OtherEntity><NonIndividualName type="BN"><NonIndividualNameText>Rollin' Insurance</NonIndividualNameText></NonIndividualName></OtherEntity><OtherEntity><NonIndividualName type="BN"><NonIndividualNameText>Safer Journeys</NonIndividualNameText></NonIndividualName></OtherEntity><OtherEntity><NonIndividualName type="BN"><NonIndividualNameText>SGI Insurance</NonIndividualNameText></NonIndividualName></OtherEntity><OtherEntity><NonIndividualName type="BN"><NonIndividualNameText>SGI.com.au</NonIndividualNameText></NonIndividualName></OtherEntity><OtherEntity><NonIndividualName type="BN"><NonIndividualNameText>SGIC</NonIndividualNameText></NonIndividualName></OtherEntity><OtherEntity><NonIndividualName type="BN"><NonIndividualNameText>SGIC HOME SECURITY</NonIndividualNameText></NonIndividualName></OtherEntity><OtherEntity><NonIndividualName type="BN"><NonIndividualNameText>SGIC INSURANCE</NonIndividualNameText></NonIndividualName></OtherEntity><OtherEntity><NonIndividualName type="BN"><NonIndividualNameText>SGIC SECURITY</NonIndividualNameText></NonIndividualName></OtherEntity><OtherEntity><NonIndividualName type="BN"><NonIndividualNameText>SGIO</NonIndividualNameText></NonIndividualName></OtherEntity><OtherEntity><NonIndividualName type="BN"><NonIndividualNameText>SGIO HOME SECURITY</NonIndividualNameText></NonIndividualName></OtherEntity><OtherEntity><NonIndividualName type="BN"><NonIndividualNameText>SGIO INSURANCE</NonIndividualNameText></NonIndividualName></OtherEntity><OtherEntity><NonIndividualName type="BN"><NonIndividualNameText>SGIO SECURITY</NonIndividualNameText></NonIndividualName></OtherEntity><OtherEntity><NonIndividualName type="BN"><NonIndividualNameText>Sharecover Enterprises</NonIndividualNameText></NonIndividualName></OtherEntity><OtherEntity><NonIndividualName type="BN"><NonIndividualNameText>Swann Insurance</NonIndividualNameText></NonIndividualName></OtherEntity><OtherEntity><NonIndividualName type="BN"><NonIndividualNameText>The United Insurance Co</NonIndividualNameText></NonIndividualName></OtherEntity><OtherEntity><NonIndividualName type="BN"><NonIndividualNameText>The United Insurance Company</NonIndividualNameText></NonIndividualName></OtherEntity><OtherEntity><NonIndividualName type="BN"><NonIndividualNameText>Upcasa</NonIndividualNameText></NonIndividualName></OtherEntity><OtherEntity><NonIndividualName type="BN"><NonIndividualNameText>VACC Insurance</NonIndividualNameText></NonIndividualName></OtherEntity><OtherEntity><NonIndividualName type="BN"><NonIndividualNameText>VACC Rural Insurance</NonIndividualNameText></NonIndividualName></OtherEntity><OtherEntity><NonIndividualName type="BN"><NonIndividualNameText>Warrante-Link Warranty Solutions</NonIndividualNameText></NonIndividualName></OtherEntity><OtherEntity><NonIndividualName type="BN"><NonIndividualNameText>Warrantyassist</NonIndividualNameText></NonIndividualName></OtherEntity><OtherEntity><NonIndividualName type="BN"><NonIndividualNameText>Wesfarmers Federation Insurance</NonIndividualNameText></NonIndividualName></OtherEntity><OtherEntity><NonIndividualName type="BN"><NonIndividualNameText>WFI</NonIndividualNameText></NonIndividualName></OtherEntity><OtherEntity><NonIndividualName type="BN"><NonIndividualNameText>'WFI'</NonIndividualNameText></NonIndividualName></OtherEntity><OtherEntity><NonIndividualName type="BN"><NonIndividualNameText>Yourinsurancegroup</NonIndividualNameText></NonIndividualName></OtherEntity></ABR>

I have some XML formatted as below, and I want to extract tags from it using Python and convert them into a .csv format.

recordLastUpdatedDate, replaced, ABN_status, ABNStatusFromDate, ABN_text, EntityTypeInd_text, 
    EntityTypeText, NonIndividualName_type, NonIndividualNameText, AddressDetails_State, AddressDetails_Postcode,
    ASICNumberType, ASICNumber, GST_status, GSTStatusFromDate, NonIndividualName_type1 = ([] for i in range(16))
    cols = ["recordLastUpdatedDate", "replaced", "ABN_status", "ABNStatusFromDate",
            "ABN_text", "EntityTypeInd_text",
            "EntityTypeText", "NonIndividualName_type",
            "NonIndividualNameText", "AddressDetails_State", "AddressDetails_Postcode", "ASICNumberType",
            "ASICNumber", "GST_status",
            "GSTStatusFromDate", "NonIndividualName_type"]
    rows = []
    xmlparse = Xet.parse(file)
    root = xmlparse.getroot()
    for i in root.iter('ABR'):
        recordLastUpdatedDate.append(i.attrib.get('recordLastUpdatedDate'))
        replaced.append(i.attrib.get('replaced'))
        for x in i.findall('ABN'):
            ABN_status.append(x.attrib.get('status'))
            ABNStatusFromDate.append(x.attrib.get('ABNStatusFromDate'))
            ABN_text.append(x.text)
        for y in i.findall('EntityType'):
            for x in y.findall('EntityTypeInd'):
                EntityTypeInd_text.append(x.text)
            for x in y.findall('EntityTypeText'):
                EntityTypeText.append(x.text)
        for x in i.findall('MainEntity'):
            for y in x.findall('NonIndividualName'):
                NonIndividualName_type.append(y.attrib.get('type'))
                for i in y.findall('NonIndividualNameText'):
                    NonIndividualNameText.append(i.text)
            for i in x.findall('BusinessAddress'):
                for x in i.findall('AddressDetails'):
                    for y in x.findall('State'):
                        AddressDetails_State.append(y.text)
                    for y in x.findall('Postcode'):
                        AddressDetails_Postcode.append(y.text)
                        # print(y.text)
        for x in i.findall('ASICNumber'):
            # print(x)
            ASICNumberType.append(x.attrib.get('ASICNumberType'))
            # print(x.attrib.get('ASICNumberType'))
            ASICNumber.append(x.text)
            # print(x.text)
        for x in i.findall('GST'):
            GST_status.append(x.attrib.get('status'))
            GSTStatusFromDate.append(x.attrib.get('GSTStatusFromDate'))
            # print(x.text)
        for x in i.findall('OtherEntity'):
            for y in x.findall('NonIndividualName'):
                NonIndividualName_type1.append(y.attrib.get('type'))
                # print(y.attrib.get('type'))

    rows.append({"recordLastUpdatedDate": recordLastUpdatedDate, "replaced": replaced, "ABN_status": ABN_status,
                 "ABNStatusFromDate": ABNStatusFromDate, "ABN_text": ABN_text, "EntityTypeInd_text": EntityTypeInd_text,
                 "EntityTypeText": EntityTypeText, "NonIndividualName_type": NonIndividualName_type,
                 "NonIndividualNameText": NonIndividualNameText, "AddressDetails_State": AddressDetails_State,
                 "AddressDetails_Postcode": AddressDetails_Postcode, "ASICNumberType": ASICNumberType,
                 "ASICNumber": ASICNumber, "GST_status": GST_status, "GSTStatusFromDate": GSTStatusFromDate,
                 "NonIndividualName_type1": NonIndividualName_type1})
    df = pd.DataFrame(rows, columns=cols)
    df.to_csv('file.csv')```

and I want this result


[enter image description here][1]


  [1]: https://i.stack.imgur.com/CIfZF.png

Leave a Comment