Note: this post is translated from stackoverflow.com. Original question: How to extract data from XML conditionally over multiple nodes using Python?
jmespath pandas parsing python xml

How to extract data from XML conditionally over multiple nodes using Python?

Posted on 2020-08-05 01:13:32

I have the following XML structure (the real file is very large):

<?xml version="1.0" encoding="utf-8"?>
<!DOCTYPE population SYSTEM "http://www.matsim.org/files/dtd/population_v6.dtd">

<population desc="Switzerland Baseline">
<person id="100127">
        <attributes>
            <attribute name="age" class="java.lang.Integer" >11</attribute>
            <attribute name="censusId" class="java.lang.Integer" >224170</attribute>
            <attribute name="employed" class="java.lang.Boolean" >false</attribute>
            <attribute name="hasLicense" class="java.lang.String" >no</attribute>
            <attribute name="htsId" class="java.lang.Long" >9112520200003</attribute>
            <attribute name="isOutside" class="java.lang.Boolean" >false</attribute>
            <attribute name="isPassenger" class="java.lang.Boolean" >true</attribute>
            <attribute name="ptSubscription" class="java.lang.Boolean" >true</attribute>
            <attribute name="sex" class="java.lang.String" >m</attribute>
        </attributes>
        <plan selected="yes">
            <activity type="home" link="220029" facility="home52627" x="647557.28056" y="6864961.034271" end_time="07:49:09" >
                <attributes>
                    <attribute name="innerParis" class="java.lang.Boolean" >true</attribute>
                </attributes>
            </activity>
            <leg mode="access_walk" dep_time="07:49:09" trav_time="00:09:38">
                <route type="generic" start_link="220029" end_link="pt_StopPoint:59229" trav_time="00:09:38" distance="692.895772305751"></route>
            </leg>
            <activity type="pt interaction" link="220029" x="647557.28056" y="6864961.034271" max_dur="00:00:00" >
            </activity>
            <leg mode="pt" dep_time="07:58:47" trav_time="00:13:13">
                <route type="enriched_pt" start_link="pt_StopPoint:59229" end_link="pt_StopPoint:59585" trav_time="00:13:13" distance="5488.133844246115">{"inVehicleTime":720.0,"transferTime":73.0,"accessStopIndex":1,"egressStopindex":11,"transitRouteId":"97574868-1_240825","transitLineId":"100110001:1","departureId":"97593123-1_240438_07:58:00"}</route>
            </leg>
            <activity type="pt interaction" link="220029" x="647557.28056" y="6864961.034271" max_dur="00:00:00" >
            </activity>
            <leg mode="transit_walk" dep_time="08:12:00" trav_time="00:00:32">
                <route type="generic" start_link="pt_StopPoint:59585" end_link="pt_StopPoint:59627" trav_time="00:00:32" distance="39.422182688315836"></route>
            </leg>
            <activity type="pt interaction" link="pt_StopPoint:59585" x="652159.1523468373" y="6862257.098785016" max_dur="00:00:00" >
            </activity>
            <leg mode="pt" dep_time="08:12:32" trav_time="00:17:27">
                <route type="enriched_pt" start_link="pt_StopPoint:59627" end_link="pt_StopPoint:59624" trav_time="00:17:27" distance="5813.159959644434">{"inVehicleTime":960.0,"transferTime":87.14818109307089,"accessStopIndex":12,"egressStopindex":25,"transitRouteId":"95327450-1_295653","transitLineId":"100110004:4","departureId":"95327497-1_295565_07:59:00"}</route>
            </leg>
            <activity type="pt interaction" link="pt_StopPoint:59585" x="652159.1523468373" y="6862257.098785016" max_dur="00:00:00" >
            </activity>
            <leg mode="egress_walk" dep_time="08:30:00" trav_time="00:11:54">
                <route type="generic" start_link="pt_StopPoint:59624" end_link="178690" trav_time="00:11:54" distance="856.0619451133888"></route>
            </leg>
            <activity type="education" link="178690" facility="16842" x="651100.0" y="6858204.3" start_time="08:19:09" end_time="17:49:09" >
                <attributes>
                    <attribute name="innerParis" class="java.lang.Boolean" >true</attribute>
                </attributes>
            </activity>
            <leg mode="access_walk" dep_time="17:49:09" trav_time="00:04:22">
                <route type="generic" start_link="178690" end_link="1185" trav_time="00:04:22" distance="313.6764640548623"></route>
            </leg>
            <activity type="pt interaction" link="178690" x="651100.0" y="6858204.3" max_dur="00:00:00" >
            </activity>
            <leg mode="pt" dep_time="17:53:31" trav_time="00:05:29">
                <route type="enriched_pt" start_link="1185" end_link="413156" trav_time="00:05:29" distance="1302.0939972036185">{"inVehicleTime":300.0,"transferTime":29.0,"accessStopIndex":0,"egressStopindex":4,"transitRouteId":"95450970-1_205771","transitLineId":"100100088:88","departureId":"95450972-1_205754_17:54:00"}</route>
            </leg>
            <activity type="pt interaction" link="178690" x="651100.0" y="6858204.3" max_dur="00:00:00" >
            </activity>
            <leg mode="transit_walk" dep_time="17:59:00" trav_time="00:01:41">
                <route type="generic" start_link="413156" end_link="pt_StopPoint:59547" trav_time="00:01:41" distance="122.21107200064658"></route>
            </leg>
            <activity type="pt interaction" link="413156" x="651043.1290909288" y="6859441.216973967" max_dur="00:00:00" >
            </activity>
            <leg mode="pt" dep_time="18:00:41" trav_time="00:18:18">
                <route type="enriched_pt" start_link="pt_StopPoint:59547" end_link="pt_StopPoint:59244" trav_time="00:18:18" distance="7166.081827475872">{"inVehicleTime":1080.0,"transferTime":18.15743999946426,"accessStopIndex":13,"egressStopindex":27,"transitRouteId":"93653132-1_291567","transitLineId":"100110006:6","departureId":"93653147-1_291586_17:45:00"}</route>
            </leg>
            <activity type="pt interaction" link="413156" x="651043.1290909288" y="6859441.216973967" max_dur="00:00:00" >
            </activity>
            <leg mode="transit_walk" dep_time="18:19:00" trav_time="00:00:39">
                <route type="generic" start_link="pt_StopPoint:59244" end_link="pt_StopPoint:59236" trav_time="00:00:39" distance="46.97102023296232"></route>
            </leg>
            <activity type="pt interaction" link="pt_StopPoint:59244" x="648272.9101174484" y="6863974.735813766" max_dur="00:00:00" >
            </activity>
            <leg mode="pt" dep_time="18:19:39" trav_time="00:03:20">
                <route type="enriched_pt" start_link="pt_StopPoint:59236" end_link="pt_StopPoint:59229" trav_time="00:03:20" distance="1073.5096075636977">{"inVehicleTime":180.0,"transferTime":20.857483139203396,"accessStopIndex":16,"egressStopindex":18,"transitRouteId":"97575531-1_238697","transitLineId":"100110001:1","departureId":"97575477-1_238631_17:57:00"}</route>
            </leg>
            <activity type="pt interaction" link="pt_StopPoint:59244" x="648272.9101174484" y="6863974.735813766" max_dur="00:00:00" >
            </activity>
            <leg mode="egress_walk" dep_time="18:23:00" trav_time="00:09:38">
                <route type="generic" start_link="pt_StopPoint:59229" end_link="220029" trav_time="00:09:38" distance="692.895772305751"></route>
            </leg>
            <activity type="home" link="220029" facility="home52627" x="647557.28056" y="6864961.034271" start_time="18:19:09" >
                <attributes>
                    <attribute name="innerParis" class="java.lang.Boolean" >true</attribute>
                </attributes>
            </activity>
        </plan>

    </person>

    <person id="100128">
        <attributes>
            <attribute name="age" class="java.lang.Integer" >11</attribute>
            <attribute name="censusId" class="java.lang.Integer" >224170</attribute>
            <attribute name="employed" class="java.lang.Boolean" >false</attribute>
            <attribute name="hasLicense" class="java.lang.String" >no</attribute>
            <attribute name="htsId" class="java.lang.Long" >1140500200003</attribute>
            <attribute name="isOutside" class="java.lang.Boolean" >false</attribute>
            <attribute name="isPassenger" class="java.lang.Boolean" >true</attribute>
            <attribute name="ptSubscription" class="java.lang.Boolean" >false</attribute>
            <attribute name="sex" class="java.lang.String" >m</attribute>
        </attributes>
        <plan selected="yes">
            <activity type="home" link="220029" facility="home52627" x="647557.28056" y="6864961.034271" end_time="07:43:26" >
                <attributes>
                    <attribute name="innerParis" class="java.lang.Boolean" >true</attribute>
                </attributes>
            </activity>
            <leg mode="walk" dep_time="07:43:26" trav_time="00:58:35">
                <route type="generic" start_link="220029" end_link="624543" trav_time="00:58:35" distance="4218.5741465571855"></route>
            </leg>
            <activity type="education" link="624543" facility="34450" x="650799.2" y="6865103.7" start_time="07:48:26" end_time="15:33:26" >
                <attributes>
                    <attribute name="innerParis" class="java.lang.Boolean" >true</attribute>
                </attributes>
            </activity>
            <leg mode="walk" dep_time="15:33:26" trav_time="00:58:35">
                <route type="generic" start_link="624543" end_link="220029" trav_time="00:58:35" distance="4218.5741465571855"></route>
            </leg>
            <activity type="home" link="220029" facility="home52627" x="647557.28056" y="6864961.034271" start_time="15:43:26" >
                <attributes>
                    <attribute name="innerParis" class="java.lang.Boolean" >true</attribute>
                </attributes>
            </activity>
        </plan>

    </person>


<!-- ====================================================================== -->

    <person id="10089246">
        <attributes>
            <attribute name="age" class="java.lang.Integer" >46</attribute>
            <attribute name="bikeAvailability" class="java.lang.String" >none</attribute>
            <attribute name="carAvailability" class="java.lang.String" >some</attribute>
            <attribute name="censusId" class="java.lang.Integer" >3700500</attribute>
            <attribute name="employed" class="java.lang.Boolean" >true</attribute>
            <attribute name="hasLicense" class="java.lang.String" >no</attribute>
            <attribute name="htsId" class="java.lang.Long" >1158140100001</attribute>
            <attribute name="isOutside" class="java.lang.Boolean" >true</attribute>
            <attribute name="isPassenger" class="java.lang.Boolean" >true</attribute>
            <attribute name="ptSubscription" class="java.lang.Boolean" >true</attribute>
            <attribute name="sex" class="java.lang.String" >f</attribute>
        </attributes>
        <plan score="-1.696111111111111" selected="yes">
            <activity type="outside" link="317294" facility="outside_39" x="654341.9834405242" y="6866654.304185225" end_time="08:53:00" >
            </activity>
            <leg mode="access_walk" dep_time="08:53:00" trav_time="00:04:19">
                <route type="generic" start_link="317294" end_link="pt_StopPoint:59765" trav_time="00:04:19" distance="310.7452086139442"></route>
            </leg>
            <activity type="pt interaction" link="317294" x="654341.9834405242" y="6866654.304185225" max_dur="00:00:00" >
            </activity>
            <leg mode="pt" dep_time="08:57:19" trav_time="00:24:41">
                <route type="enriched_pt" start_link="pt_StopPoint:59765" end_link="pt_StopPoint:59743" trav_time="00:24:41" distance="6008.570171146984">{"inVehicleTime":1440.0,"transferTime":41.0,"accessStopIndex":4,"egressStopindex":15,"transitRouteId":"97743554-1_262679","transitLineId":"100112003:T3B","departureId":"97743496-1_262473_08:52:00"}</route>
            </leg>
            <activity type="pt interaction" link="317294" x="654341.9834405242" y="6866654.304185225" max_dur="00:00:00" >
            </activity>
            <leg mode="egress_walk" dep_time="09:22:00" trav_time="00:06:08">
                <route type="generic" start_link="pt_StopPoint:59743" end_link="536071" trav_time="00:06:08" distance="440.9508045191164"></route>
            </leg>
            <activity type="outside" link="536071" facility="outside_40" x="657013.4348947266" y="6862274.328993361" end_time="09:02:43" >
            </activity>
            <leg mode="access_walk" dep_time="09:02:43" trav_time="00:12:24">
                <route type="generic" start_link="536071" end_link="pt_StopPoint:59491" trav_time="00:12:24" distance="892.1236691658619"></route>
            </leg>
            <activity type="pt interaction" link="536071" x="657013.4348947266" y="6862274.328993361" max_dur="00:00:00" >
            </activity>
            <leg mode="pt" dep_time="09:15:07" trav_time="00:24:52">
                <route type="enriched_pt" start_link="pt_StopPoint:59491" end_link="pt_StopPoint:59490" trav_time="00:24:52" distance="8713.681534957017">{"inVehicleTime":1440.0,"transferTime":52.315713530624635,"accessStopIndex":1,"egressStopindex":19,"transitRouteId":"95010770-1_255913","transitLineId":"100110009:9","departureId":"95010790-1_255893_09:14:00"}</route>
            </leg>
            <activity type="pt interaction" link="536071" x="657013.4348947266" y="6862274.328993361" max_dur="00:00:00" >
            </activity>
            <leg mode="egress_walk" dep_time="09:40:00" trav_time="00:00:29">
                <route type="generic" start_link="pt_StopPoint:59490" end_link="594991" trav_time="00:00:29" distance="34.65503897934853"></route>
            </leg>
            <activity type="work" link="594991" facility="28150" x="649434.4" y="6863802.0" start_time="09:21:49" end_time="16:51:49" >
                <attributes>
                    <attribute name="innerParis" class="java.lang.Boolean" >true</attribute>
                </attributes>
            </activity>
            <leg mode="access_walk" dep_time="16:51:49" trav_time="00:05:31">
                <route type="generic" start_link="594991" end_link="635504" trav_time="00:05:31" distance="397.1565592339538"></route>
            </leg>
            <activity type="pt interaction" link="594991" x="649434.4" y="6863802.0" max_dur="00:00:00" >
            </activity>
            <leg mode="pt" dep_time="16:57:20" trav_time="00:05:40">
                <route type="enriched_pt" start_link="635504" end_link="540678" trav_time="00:05:40" distance="998.2508127070034">{"inVehicleTime":180.0,"transferTime":160.0,"accessStopIndex":8,"egressStopindex":11,"transitRouteId":"96638007-1_223844","transitLineId":"100100043:43","departureId":"96638008-1_223952_16:43:00"}</route>
            </leg>
            <activity type="pt interaction" link="594991" x="649434.4" y="6863802.0" max_dur="00:00:00" >
            </activity>
            <leg mode="transit_walk" dep_time="17:03:00" trav_time="00:01:22">
                <route type="generic" start_link="540678" end_link="478229" trav_time="00:01:22" distance="98.80060166539256"></route>
            </leg>
            <activity type="pt interaction" link="540678" x="650551.9063578482" y="6864139.534388704" max_dur="00:00:00" >
            </activity>
            <leg mode="pt" dep_time="17:04:22" trav_time="00:08:37">
                <route type="enriched_pt" start_link="478229" end_link="159089" trav_time="00:08:37" distance="4423.120382623623">{"inVehicleTime":420.0,"transferTime":97.66616527883889,"accessStopIndex":0,"egressStopindex":2,"transitRouteId":"97726763-1_7371","transitLineId":"800:E","departureId":"97726763-1_7371_17:06:00"}</route>
            </leg>
            <activity type="pt interaction" link="540678" x="650551.9063578482" y="6864139.534388704" max_dur="00:00:00" >
            </activity>
            <leg mode="egress_walk" dep_time="17:13:00" trav_time="00:06:06">
                <route type="generic" start_link="159089" end_link="317294" trav_time="00:06:06" distance="438.9313005772999"></route>
            </leg>
            <activity type="outside" link="317294" facility="outside_39" x="654341.9834405242" y="6866654.304185225" end_time="17:20:00" >
            </activity>
        </plan>

    </person>

<!-- ====================================================================== -->

    <person id="10089479">
        <attributes>
            <attribute name="age" class="java.lang.Integer" >55</attribute>
            <attribute name="bikeAvailability" class="java.lang.String" >none</attribute>
            <attribute name="carAvailability" class="java.lang.String" >none</attribute>
            <attribute name="censusId" class="java.lang.Integer" >3709337</attribute>
            <attribute name="employed" class="java.lang.Boolean" >true</attribute>
            <attribute name="hasLicense" class="java.lang.String" >yes</attribute>
            <attribute name="htsId" class="java.lang.Long" >1151660300002</attribute>
            <attribute name="isOutside" class="java.lang.Boolean" >true</attribute>
            <attribute name="isPassenger" class="java.lang.Boolean" >false</attribute>
            <attribute name="ptSubscription" class="java.lang.Boolean" >true</attribute>
            <attribute name="sex" class="java.lang.String" >m</attribute>
        </attributes>
        <plan score="-1.8791666666666669" selected="yes">
            <activity type="outside" link="327762" facility="outside_37" x="655632.6715938409" y="6861058.197253688" end_time="07:14:00" >
            </activity>
            <leg mode="access_walk" dep_time="07:14:00" trav_time="00:03:36">
                <route type="generic" start_link="327762" end_link="pt_StopPoint:59248" trav_time="00:03:36" distance="258.02386903819104"></route>
            </leg>
            <activity type="pt interaction" link="327762" x="655632.6715938409" y="6861058.197253688" max_dur="00:00:00" >
            </activity>
            <leg mode="pt" dep_time="07:17:36" trav_time="00:08:24">
                <route type="enriched_pt" start_link="pt_StopPoint:59248" end_link="pt_StopPoint:59491" trav_time="00:08:24" distance="1164.8022063210396">{"inVehicleTime":240.0,"transferTime":264.0,"accessStopIndex":27,"egressStopindex":30,"transitRouteId":"94953838-1_253233","transitLineId":"100110009:9","departureId":"94953856-1_253227_06:46:00"}</route>
            </leg>
            <activity type="pt interaction" link="327762" x="655632.6715938409" y="6861058.197253688" max_dur="00:00:00" >
            </activity>
            <leg mode="egress_walk" dep_time="07:26:00" trav_time="00:08:38">
                <route type="generic" start_link="pt_StopPoint:59491" end_link="627524" trav_time="00:08:38" distance="621.2719061094203"></route>
            </leg>
            <activity type="outside" link="627524" facility="outside_42" x="657205.6243098765" y="6861527.12401084" end_time="07:20:30" >
            </activity>
            <leg mode="access_walk" dep_time="07:20:30" trav_time="00:14:45">
                <route type="generic" start_link="627524" end_link="pt_StopPoint:59737" trav_time="00:14:45" distance="1061.5974852304669"></route>
            </leg>
            <activity type="pt interaction" link="627524" x="657205.6243098765" y="6861527.12401084" max_dur="00:00:00" >
            </activity>
            <leg mode="pt" dep_time="07:35:15" trav_time="00:29:45">
                <route type="enriched_pt" start_link="pt_StopPoint:59737" end_link="pt_StopPoint:59117" trav_time="00:29:45" distance="8166.44067225298">{"inVehicleTime":1680.0,"transferTime":105.0,"accessStopIndex":0,"egressStopindex":15,"transitRouteId":"97595733-1_258113","transitLineId":"100112013:T3A","departureId":"97595616-1_258235_07:37:00"}</route>
            </leg>
            <activity type="pt interaction" link="627524" x="657205.6243098765" y="6861527.12401084" max_dur="00:00:00" >
            </activity>
            <leg mode="egress_walk" dep_time="08:05:00" trav_time="00:02:23">
                <route type="generic" start_link="pt_StopPoint:59117" end_link="665109" trav_time="00:02:23" distance="171.20821167179105"></route>
            </leg>
            <activity type="outside" link="665109" facility="outside_43" x="650473.6924192724" y="6858251.112472806" end_time="16:42:11" >
            </activity>
            <leg mode="outside" dep_time="16:42:11" trav_time="00:00:00">
                <route type="generic" start_link="665109" end_link="665109" trav_time="00:00:00" distance="0.0"></route>
            </leg>
            <activity type="outside" link="665109" facility="outside_43" x="650473.6924192724" y="6858251.112472806" end_time="17:46:00" >
            </activity>
            <leg mode="transit_walk" dep_time="17:46:00" trav_time="00:04:53">
                <route type="generic" start_link="665109" end_link="231014" trav_time="00:04:53" distance="352.36602131902845"></route>
            </leg>
            <activity type="outside" link="231014" facility="outside_44" x="650420.9982895704" y="6857985.233069532" end_time="17:46:52" >
            </activity>
            <leg mode="access_walk" dep_time="17:46:52" trav_time="00:05:38">
                <route type="generic" start_link="231014" end_link="pt_StopPoint:59624" trav_time="00:05:38" distance="405.44328112370425"></route>
            </leg>
            <activity type="pt interaction" link="231014" x="650420.9982895704" y="6857985.233069532" max_dur="00:00:00" >
            </activity>
            <leg mode="pt" dep_time="17:52:30" trav_time="00:04:30">
                <route type="enriched_pt" start_link="pt_StopPoint:59624" end_link="pt_StopPoint:59546" trav_time="00:04:30" distance="1583.1246141973938">{"inVehicleTime":180.0,"transferTime":90.0,"accessStopIndex":1,"egressStopindex":4,"transitRouteId":"95294927-1_296749","transitLineId":"100110004:4","departureId":"95294963-1_297773_17:52:00"}</route>
            </leg>
            <activity type="pt interaction" link="231014" x="650420.9982895704" y="6857985.233069532" max_dur="00:00:00" >
            </activity>
            <leg mode="transit_walk" dep_time="17:57:00" trav_time="00:01:28">
                <route type="generic" start_link="pt_StopPoint:59546" end_link="pt_StopPoint:59547" trav_time="00:01:28" distance="106.68356774609235"></route>
            </leg>
            <activity type="pt interaction" link="pt_StopPoint:59546" x="650982.2244857589" y="6859565.71604913" max_dur="00:00:00" >
            </activity>
            <leg mode="pt" dep_time="17:58:28" trav_time="00:17:31">
                <route type="enriched_pt" start_link="pt_StopPoint:59547" end_link="pt_StopPoint:59243" trav_time="00:17:31" distance="6574.567657725524">{"inVehicleTime":1020.0,"transferTime":31.097026878254837,"accessStopIndex":14,"egressStopindex":27,"transitRouteId":"93654363-1_293093","transitLineId":"100110006:6","departureId":"93654337-1_293171_17:41:00"}</route>
            </leg>
            <activity type="pt interaction" link="pt_StopPoint:59546" x="650982.2244857589" y="6859565.71604913" max_dur="00:00:00" >
            </activity>
            <leg mode="egress_walk" dep_time="18:16:00" trav_time="00:03:22">
                <route type="generic" start_link="pt_StopPoint:59243" end_link="327762" trav_time="00:03:22" distance="241.49593470368671"></route>
            </leg>
            <activity type="outside" link="327762" facility="outside_37" x="655632.6715938409" y="6861058.197253688" end_time="18:12:00" >
            </activity>
        </plan>

    </person>
</population>

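Basically, I want to walk through the different nodes and extract data. These are the conditions that must hold and the information I want to extract:

  • for every person
  • if the plan has selected="yes"
  • if the leg has mode="pt", extract dep_time
  • and from its route, extract start_link

So far my approach ends with an empty list, and I can't explain why. The goal is a DataFrame with one column for dep_time and one for start_link.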

import gzip
import xmltodict
import jmespath
import pandas as pd

box = xmltodict.parse(gzip.open(file, 'r'))
expression = jmespath.compile('population.person[].plan[?"@selected"==`yes`].leg[?"@mode"==`pt`].["@dep_time"].route[].["@start_link"]')
coords = expression.search(box)
coords = pd.DataFrame.from_dict(coords)

With the data provided above, I would expect the following output:

    dep_time    start_link
1   "07:58:47"  "pt_StopPoint:59229"   
2   "08:12:32"  "pt_StopPoint:59627"
3   "17:53:31"  "1185"
4   "18:00:41"  "pt_StopPoint:59547"
5   "18:19:39"  "pt_StopPoint:59236" # this is the last entry of the first person; the second doesn't have any pt trips
6   "08:57:19"  "pt_StopPoint:59765" # the entries of the third person follow from here
7   ...         ...
8   ...         ...
...

Thank you very much for your help!


Asked by: Yves
Views: 7
sammywemmy 2020-04-26 06:42

I used the xml.etree module from the Python standard library; see if it works for you:

import xml.etree.ElementTree as ET
from collections import defaultdict

import pandas as pd

# `data` holds the XML wrapped in a string.
# If you are reading from a file, this should work instead:
# root = ET.parse('data.xml').getroot()
root = ET.fromstring(data)

d = defaultdict(list)

# Select every pt leg inside a selected plan, across all persons
for ent in root.findall('./person/plan[@selected="yes"]/leg[@mode="pt"]'):
    d['dep_time'].append(ent.get('dep_time'))
    for anoda in ent.findall('route'):
        d['route'].append(anoda.get('start_link'))

pd.DataFrame(d)

    dep_time    route
0   07:58:47    pt_StopPoint:59229
1   08:12:32    pt_StopPoint:59627
2   17:53:31    1185
3   18:00:41    pt_StopPoint:59547
4   18:19:39    pt_StopPoint:59236
5   08:57:19    pt_StopPoint:59765
6   09:15:07    pt_StopPoint:59491
7   16:57:20    635504
8   17:04:22    478229
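
Since the question mentions a very large, gzip-compressed file, here is a rough sketch of how the same XPath filter could be applied one person at a time with ET.iterparse, so the whole tree never has to sit in memory. The names iter_pt_legs and read_gzipped and the small sample document are made up for illustration and are not part of the original answer; the sketch also assumes each leg has at most one route.

```python
import gzip
import io
import xml.etree.ElementTree as ET


def iter_pt_legs(source):
    """Yield (dep_time, start_link) for each pt leg in a selected plan."""
    # An "end" event fires once an element and all its children are parsed.
    for _, elem in ET.iterparse(source, events=("end",)):
        if elem.tag != "person":
            continue
        for leg in elem.findall('./plan[@selected="yes"]/leg[@mode="pt"]'):
            route = leg.find("route")
            # Assumption: a leg without a route yields None as start_link
            start_link = route.get("start_link") if route is not None else None
            yield leg.get("dep_time"), start_link
        # Release the finished person's subtree; the emptied element itself
        # stays attached to the root.
        elem.clear()


def read_gzipped(path):
    # Hypothetical helper: `path` is a gzip-compressed population file.
    with gzip.open(path, "rb") as fh:
        return list(iter_pt_legs(fh))


# Tiny made-up document to show the filter in action
sample = b"""<population>
  <person id="1">
    <plan selected="yes">
      <leg mode="walk" dep_time="07:00:00"><route start_link="a"/></leg>
      <leg mode="pt" dep_time="07:10:00"><route start_link="b"/></leg>
    </plan>
    <plan selected="no">
      <leg mode="pt" dep_time="09:00:00"><route start_link="c"/></leg>
    </plan>
  </person>
</population>"""

rows = list(iter_pt_legs(io.BytesIO(sample)))
print(rows)
```

The resulting list of tuples can be handed to pd.DataFrame(rows, columns=["dep_time", "start_link"]). Emitting one tuple per leg also keeps the two columns aligned if a leg happens to have no route, which the two-list approach above does not guarantee.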