Replace Highcharts plotting library with open source alternative

I am working on implementing Altair, https://altair-viz.github.io/index.html

It is possible to implement charts where we calculate the mean or median at the server, then send the results to the browser:

import altair as alt
import pandas as pd

# Create a Pandas DataFrame from the database events and annotations
df = pd.DataFrame.from_records(database_events.values(db_display_name_relationship, "db_series_names_to_use").annotate(**summary_annotations).order_by("db_series_names_to_use"))
df.rename(columns={db_display_name_relationship:"x_ray_system_name"}, inplace=True)
df.rename(columns={"db_series_names_to_use":"data_point_name"}, inplace=True)

# Change Decimal values to float so that to_json() works (Decimal values can't be JSON serialised)
df["mean"] = df["mean"].astype(float)

# Create a chart
chart = alt.Chart(df).mark_bar().encode(
    column=alt.Column("data_point_name"),
    x=alt.X("x_ray_system_name"),
    y=alt.Y("mean"),
    color="x_ray_system_name"
).interactive()


It is also easy to implement charts where we supply Altair with all of the data, and get the browser’s JavaScript to calculate the mean or median:

# Create a Pandas DataFrame from the database events
df = pd.DataFrame.from_records(database_events.values(db_display_name_relationship, "db_series_names_to_use", db_value_name))
df.rename(columns={db_display_name_relationship:"x_ray_system_name"}, inplace=True)
df.rename(columns={"db_series_names_to_use":"data_point_name"}, inplace=True)

# Change Decimal values to float so that to_json() works (Decimal values can't be JSON serialised)
df[db_value_name] = df[db_value_name].astype(float)

# Disable the max rows allowed in an Altair chart
alt.data_transformers.disable_max_rows()

# Create a chart; the mean is calculated in the browser by Vega-Lite
chart = alt.Chart(df).mark_bar().encode(
    column=alt.Column("data_point_name"),
    x=alt.X("x_ray_system_name"),
    y=alt.Y(db_value_name, aggregate="mean"),
    color="x_ray_system_name"
).interactive()

Sending all of the data to the browser could result in problems (maybe memory issues, or slow behaviour?). However, I wonder if keeping all the data may have advantages that I am not aware of at the moment (easy histograms in the browser, perhaps?).
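To make the trade-off concrete, here is a minimal sketch of the two options using Pandas only. The DataFrame contents and the "dlp" column are made-up stand-ins for the real query results, not the actual field names.

```python
import pandas as pd

# Hypothetical stand-in for the query results: one row per dose event.
df = pd.DataFrame(
    {
        "x_ray_system_name": ["CT1", "CT1", "CT1", "CT2", "CT2", "CT2"],
        "data_point_name": ["Head", "Head", "Chest", "Head", "Chest", "Chest"],
        "dlp": [900.0, 1100.0, 400.0, 950.0, 380.0, 420.0],
    }
)

# Option 1: aggregate on the server, so only one row per group is sent.
summary = (
    df.groupby(["x_ray_system_name", "data_point_name"])["dlp"]
    .agg(["mean", "median", "count"])
    .reset_index()
)

# Option 2: send every row and let the browser do the aggregation.
full_payload = df.to_json(orient="records")
summary_payload = summary.to_json(orient="records")

print(f"rows sent: all data = {len(df)}, summary = {len(summary)}")
```

The summary grows with the number of system/name combinations, whereas the full payload grows with the number of events, so the gap should widen as the database gets larger. Keeping the raw rows is what makes histograms and box plots possible in the browser.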


29/8/2020 update: I’m now using all of the data in the data frame so that I can easily use Altair for average-over-time charts. I’ll need to test whether this is OK for large amounts of data from a server and client perspective. I’ll need to try it on a “standard” computer’s browser (my laptop has 32 GB RAM, and my desktop 64 GB).
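If the full data set turns out to be too heavy for the browser, the average-over-time values could instead be calculated on the server. The sketch below is one possible way to do that with Pandas; the "study_date" and "dlp" columns and the average_over_time helper are hypothetical names used for illustration.

```python
import pandas as pd

# Hypothetical event data; column names are illustrative only.
df = pd.DataFrame(
    {
        "study_date": pd.to_datetime(
            ["2020-01-05", "2020-01-20", "2020-02-03", "2020-02-25", "2020-02-27"]
        ),
        "x_ray_system_name": ["CT1", "CT1", "CT1", "CT2", "CT2"],
        "dlp": [900.0, 1100.0, 800.0, 400.0, 600.0],
    }
)


def average_over_time(data, value_col, period="M"):
    """Return the mean and count of value_col per system per time period."""
    grouped = data.assign(period=data["study_date"].dt.to_period(period))
    result = (
        grouped.groupby(["x_ray_system_name", "period"])[value_col]
        .agg(["mean", "count"])
        .reset_index()
    )
    # Convert each period to its start date so it plots on a date axis.
    result["period"] = result["period"].dt.to_timestamp()
    return result


monthly = average_over_time(df, "dlp", period="M")
print(monthly)
```

Only one row per system per period is then sent to the browser, at the cost of losing the individual data points.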

Might be useful: https://medium.com/plotly/introducing-plotly-express-808df010143d

To do:

FAILED: Sort out the width of the new Altair charts
- Replaced Altair with Plotly - size now more-or-less fixed
DONE: Plumb in over-time chart time period
DONE: Add choice of box plot or bar chart for “average” charts (if “median” selected then a boxplot is produced)
DONE: Add static Plotly JavaScript files rather than current links to latest on-line versions
DONE: Rationalise chart production code to avoid duplication of database queries (“study” and “request” level queries can share the same routines, with appropriate fields collected in the Pandas DataFrame)
DONE: Separate the new chart code into one function per chart. Pass each of these the Pandas DataFrame
DONE: Add charts of CTDIvol and DLP vs patient weight
DONE: Add chart of requested procedure DLP over time
DONE: Add charts of acquisition protocol CTDI and DLP over time
DONE: Workload chart: link an hourly chart to selected data in the daily chart? [actually sub-divided each day into hours in the same chart]
DONE: Make it possible to sort average chart x-axes by frequency, name or value. Boxplot and frequency chart data are also sortable.
DONE: Update mammo, radiographic and fluoro charts with this new code
DONE: Add DX acquisition/study/request DAP vs patient mass
DONE: Calculate histogram data using NumPy in Python and create a Plotly bar chart, rather than embedding all the data in a Plotly Histogram chart (https://numpy.org/doc/stable/reference/generated/numpy.histogram.html). Should improve browser responsiveness if a large number of histograms are plotted.
DONE: Unlink the y-axes of the over-time sub-plots
DONE: Add mammography charts of average AGD and average AGD vs. compressed thickness
DONE: Add MG chart of average acquisition AGD over time
DONE: Add RF charts of study description and requested procedure name DAP over time
DONE: Add RF chart of study description and requested procedure name DAP and frequency per physician
DONE: Add option to fix histogram bins across subplots, or have bins calculated per subplot
FIXED: Switching charts “on” from home page before doing anything causes “Not found” error when trying to redirect to http://127.0.0.1:8000/&plotCharts=on
ON-HOLD: Add RF chart of # events per study description and requested procedure name - future
ON-HOLD: Add RF filter for physician - future
PARTIAL: Add button to download plotted data as csv; DONE for average bar charts and frequency charts
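
The histogram items above (pre-binning with NumPy, and fixed or per-subplot bins) can be sketched roughly as follows. This is not the actual OpenREM code: the histogram_bars helper, its parameters and the example values are assumptions made for illustration.

```python
import numpy as np


def histogram_bars(values_by_series, n_bins=10, shared_bins=True):
    """Pre-bin each series and return bar centres, widths and counts."""
    bin_range = None
    if shared_bins:
        # Use one global range so every series gets identical bin edges.
        all_values = np.concatenate(
            [np.asarray(v, dtype=float) for v in values_by_series.values()]
        )
        bin_range = (all_values.min(), all_values.max())

    bars = {}
    for name, values in values_by_series.items():
        counts, edges = np.histogram(
            np.asarray(values, dtype=float), bins=n_bins, range=bin_range
        )
        bars[name] = {
            "x": ((edges[:-1] + edges[1:]) / 2).tolist(),  # bin centres
            "width": np.diff(edges).tolist(),
            "y": counts.tolist(),
        }
    return bars


# Hypothetical values for two subplots.
data = {"CT1": [100, 150, 200, 250, 300], "CT2": [120, 130, 500]}
bars = histogram_bars(data, n_bins=4, shared_bins=True)
print(bars["CT1"]["y"], bars["CT2"]["y"])
```

Each entry holds only n_bins values per series, and could be passed to a Plotly bar trace (for example as the x, y and width arguments of plotly.graph_objects.Bar), so the browser never receives the raw data. Setting shared_bins=False lets NumPy choose the range separately for each series.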

