Haswell-EP weird L2 cache event behavior

I’m working on a Haswell machine (Shepard, E5-2698 v3), very sadly back to using the raw API.  I did a bunch of measurements on lulesh and the results looked believable.  Then I started measuring miniFE, and I saw an L2 miss rate > 100%.  I have tried both native and preset events (the list of events that I’ve tried in various combinations is shown below), and I still get strange results.  I also ran on two different nodes (each application on the same node at different times), and the results are consistent but still strange.  I saw that back in October 2016 there were similar posts, but no real answer to this problem.  Can someone help?

    After discussion on the list, the consensus seems to be that L2 events on recent Intel chips are unreliable or hard to interpret, especially when hardware prefetching is enabled (the default).
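
For what it's worth, here is a minimal sketch of one way a miss rate can come out above 100%. It assumes (this is only an assumption for illustration, not a documented fact about any particular Haswell event) that the miss counter includes misses triggered by hardware prefetch requests while the access counter counts demand requests only. The function name and all counts are hypothetical, not measurements from lulesh or miniFE.

```python
def miss_rate(misses, accesses):
    """Return misses / accesses as a percentage."""
    if accesses <= 0:
        raise ValueError("accesses must be positive")
    return 100.0 * misses / accesses


# Hypothetical counts, made up for illustration only.
demand_accesses = 1_000_000   # assumed: access counter sees demand requests only
demand_misses = 400_000       # misses caused by demand requests
prefetch_misses = 900_000     # assumed: misses caused by hardware prefetch requests

# Numerator and denominator cover the same population: ratio stays in 0..100%.
consistent = miss_rate(demand_misses, demand_accesses)

# Numerator includes prefetch misses, denominator does not: ratio can exceed 100%.
mismatched = miss_rate(demand_misses + prefetch_misses, demand_accesses)

print(f"consistent: {consistent:.1f}%")
print(f"mismatched: {mismatched:.1f}%")
```

If something like this is going on, the ratio only makes sense when both events count the same kind of request, which is worth checking in the event descriptions before trusting a derived miss rate.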

