flight.phases() returns a Pandas error on some flights #242

tomschelsen · 2022-09-16T07:51:26Z

Discussed in #241

Using traffic version 2.8.0

I have a Traffic object containing a bunch of flights that I created from custom parsed data (not the formats/sources directly supported by the library). I am trying to keep only the climb phases.

It works fine on many flights (I had tested .phases().query('phase == "CLIMB"') on a bunch of flights from this dataset and visualised the results beforehand), but it stops on one particular flight with this error:

---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
Input In [13], in <cell line: 6>()
      1 # working flight : BEL3631
      2 
      3 # not working flight : BAW45EM
      5 example_flight = traffic_flights["BAW45EM"]
----> 6 example_climb = example_flight.phases().query('phase == "CLIMB"')


File /usr/local/lib/python3.8/dist-packages/traffic/algorithms/openap.py:33, in OpenAP.phases(self, twindow)
     25 fp = FlightPhase()
     26 fp.set_trajectory(
     27     (self.data.timestamp.values - np.datetime64("1970-01-01"))
     28     / np.timedelta64(1, "s"),
   (...)
     31     self.data.vertical_rate.values,
     32 )
---> 33 return self.assign(phase=fp.phaselabel(twindow=twindow)).assign(
     34     phase=lambda df: df.phase.str.replace("GND", "GROUND")
     35     .str.replace("CL", "CLIMB")
     36     .str.replace("DE", "DESCENT")
     37     .str.replace("CR", "CRUISE")
     38     .str.replace("LVL", "LEVEL")
     39 )

File /usr/local/lib/python3.8/dist-packages/traffic/core/mixins.py:304, in DataFrameMixin.assign(self, *args, **kwargs)
    298 def assign(self: T, *args: Any, **kwargs: Any) -> T:
    299     """
    300     Applies the Pandas :meth:`~pandas.DataFrame.assign` method to the
    301     underlying pandas DataFrame and get the result back in the same
    302     structure.
    303     """
--> 304     return self.__class__(self.data.assign(*args, **kwargs))

File /usr/local/lib/python3.8/dist-packages/pandas/core/frame.py:4512, in DataFrame.assign(self, **kwargs)
   4509 data = self.copy()
   4511 for k, v in kwargs.items():
-> 4512     data[k] = com.apply_if_callable(v, data)
   4513 return data

File /usr/local/lib/python3.8/dist-packages/pandas/core/frame.py:3655, in DataFrame.__setitem__(self, key, value)
   3652     self._setitem_array([key], value)
   3653 else:
   3654     # set column
-> 3655     self._set_item(key, value)

File /usr/local/lib/python3.8/dist-packages/pandas/core/frame.py:3832, in DataFrame._set_item(self, key, value)
   3822 def _set_item(self, key, value) -> None:
   3823     """
   3824     Add series to DataFrame in specified column.
   3825 
   (...)
   3830     ensure homogeneity.
   3831     """
-> 3832     value = self._sanitize_column(value)
   3834     if (
   3835         key in self.columns
   3836         and value.ndim == 1
   3837         and not is_extension_array_dtype(value)
   3838     ):
   3839         # broadcast across multiple columns if necessary
   3840         if not self.columns.is_unique or isinstance(self.columns, MultiIndex):

File /usr/local/lib/python3.8/dist-packages/pandas/core/frame.py:4535, in DataFrame._sanitize_column(self, value)
   4532     return _reindex_for_setitem(value, self.index)
   4534 if is_list_like(value):
-> 4535     com.require_length_match(value, self.index)
   4536 return sanitize_array(value, self.index, copy=True, allow_2d=True)

File /usr/local/lib/python3.8/dist-packages/pandas/core/common.py:557, in require_length_match(data, index)
    553 """
    554 Check the length of data matches the length of the index.
    555 """
    556 if len(data) != len(index):
--> 557     raise ValueError(
    558         "Length of values "
    559         f"({len(data)}) "
    560         "does not match length of index "
    561         f"({len(index)})"
    562     )

ValueError: Length of values (971) does not match length of index (1182)
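For context on the message itself: pandas raises this ValueError whenever a list-like value assigned as a column does not have the same length as the DataFrame's index. Here that would mean the label array returned by fp.phaselabel() was shorter (971) than the flight's data (1182). A minimal sketch that reproduces the same kind of error on synthetic data; the altitude column and the label values are made up for illustration and are not taken from the attached flight:

```python
import numpy as np
import pandas as pd

# Synthetic stand-in for a flight's data: 5 rows (values are made up)
df = pd.DataFrame({"altitude": [0.0, 1000.0, 2000.0, 3000.0, 4000.0]})

# Hypothetical phase labels with one element fewer than the number of rows
labels = np.array(["GND", "CL", "CL", "CL"])

try:
    df.assign(phase=labels)
    error_message = None
except ValueError as exc:
    # pandas refuses to set a column whose length differs from the index
    error_message = str(exc)

print(error_message)
```

This only shows how the error is triggered on the pandas side; it does not explain why fewer labels than rows were produced for this flight.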

I attach here, as a CSV, a subset of the dataset (one flight, the result of traffic_flights["BAW45EM"] in this case) for which flight.phases() aborts:
problematic_flight_for_phases_climb.csv

Thank you :)


xoolive · 2022-09-16T17:30:34Z

There must be some invalid value somewhere in the data
(I did not have time to dig much further)

If you do f.resample('5s').phases(), it just works. (I took 5 seconds because that looks like the sampling rate you already have.)
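To give a rough idea of what putting a trajectory on a regular 5-second grid involves, here is a plain-pandas sketch on synthetic data. It is only an illustration: it is not the implementation of Flight.resample in traffic, and the altitude column and all values are assumptions made up for the example.

```python
import numpy as np
import pandas as pd

# Synthetic, irregularly sampled track with one duplicated timestamp
raw = pd.DataFrame(
    {
        "timestamp": pd.to_datetime(
            [
                "2022-09-16 07:00:00",
                "2022-09-16 07:00:05",
                "2022-09-16 07:00:05",  # duplicate sample
                "2022-09-16 07:00:20",  # gap: 07:00:10 and 07:00:15 missing
            ]
        ),
        "altitude": [1000.0, 1100.0, 1100.0, 1400.0],
    }
)

regular = (
    raw.drop_duplicates("timestamp")  # keep one row per timestamp
    .set_index("timestamp")
    .resample("5s")                   # regular 5-second bins
    .mean()                           # empty bins become NaN
    .interpolate(method="time")       # fill gaps linearly in time
    .reset_index()
)

print(regular)
```

After this step every row sits on the 5-second grid and the gaps are filled, so the data no longer contains duplicated or missing samples.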

If that fix/hack is enough for you, let us close the issue.
If you want to dig further in the data, understand what happens or find a further bug, please comment with what you expect from now on.
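For anyone who does want to dig further, a possible starting point is to count non-finite values and duplicated timestamps in the flight's DataFrame. The helper below is hypothetical (it is not part of traffic). The timestamp and vertical_rate column names appear in the traceback above, altitude is an assumption, and the sample data is synthetic.

```python
import numpy as np
import pandas as pd


def summarize_invalid(df, columns):
    """Count NaN/inf values per column and duplicated timestamps.

    Hypothetical helper for illustration; not part of the traffic library.
    """
    report = {}
    for col in columns:
        # Coerce to float so that non-numeric entries also count as invalid
        values = pd.to_numeric(df[col], errors="coerce").to_numpy(dtype=float)
        report[col] = int((~np.isfinite(values)).sum())
    report["duplicated_timestamps"] = int(df["timestamp"].duplicated().sum())
    return report


# Synthetic sample with one NaN, one inf and one duplicated timestamp
sample = pd.DataFrame(
    {
        "timestamp": pd.to_datetime(
            [
                "2022-09-16 07:00:00",
                "2022-09-16 07:00:05",
                "2022-09-16 07:00:05",
                "2022-09-16 07:00:15",
            ]
        ),
        "altitude": [1000.0, np.nan, 1200.0, 1300.0],
        "vertical_rate": [500.0, 600.0, np.inf, 550.0],
    }
)

report = summarize_invalid(sample, ["altitude", "vertical_rate"])
print(report)
```

On a real flight, something like summarize_invalid(flight.data, [...]) with the relevant column names could show whether invalid values are present at all; whether they account for the 971 vs 1182 mismatch would still need to be checked.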

tomschelsen · 2022-09-19T07:10:23Z

The fix is fine by me; part of me would like to understand the why, but hey, I've got to move forward to the next steps... ;)
Closing the issue, thank you very much for your help.

tomschelsen closed this as completed Sep 19, 2022
