Channel: Active questions tagged feeds - Drupal Answers
Viewing all articles
Browse latest Browse all 189

How to read a feed that provides no GUID, but avoid false positives and false negatives of duplicates? [closed]


We receive alerts from an RSS feed. Each item contains a title, message, and pub date. For whatever reason, the organization that provides the system for creating the alerts did not include a GUID, and does not seem inclined to add one.

The result was that alerts were being missed, because the folks who type the alerts in often reuse the same title, and the title is what gets used as the GUID in the absence of one.

Right now we're using Aggregator.

One issue is that Aggregator doesn't seem to publish any hook that allows access to the raw feed values, including the pub date. I wrote a presave hook that takes the title and appends the feed item's timestamp to it. This created a truly unique GUID, but swung the pendulum all the way to the other side. If I don't touch the GUID and let it use the title, it treats items as duplicates when the messages are actually different: a false positive.

But if I make a unique GUID in the hook, nothing is ever a duplicate, even when it should be: a false negative.

Cron runs every minute, and it saved the same messages over and over, each with a different constructed GUID. I'm guessing the decision whether to create the Aggregator item is made downstream from the hook.

Would Feeds handle this differently? Another option would be to compare the title and message to stored items.
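
To make the "compare the title and message to stored items" idea concrete, here is a minimal sketch in Python. It is not Drupal or Aggregator code, and the names `stable_guid` and `filter_new`, the item dictionary keys, and the in-memory `seen` set are all hypothetical stand-ins for illustration. The idea is to derive the GUID only from values that come from the feed itself, so the same item hashes to the same GUID on every cron run, while two alerts with the same title but different messages get different GUIDs. It assumes the pub date used is the one published in the feed; if that value is not available, an empty string can be passed so only the title and message are hashed.

```python
import hashlib


def stable_guid(title: str, message: str, pub_date: str = "") -> str:
    """Derive a deterministic GUID from the item's own fields (hypothetical helper)."""
    # Collapse whitespace so trivial formatting differences do not change the hash.
    parts = [" ".join(part.split()) for part in (title, message, pub_date)]
    # Join with a unit-separator character so field boundaries stay unambiguous.
    payload = "\x1f".join(parts).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()


def filter_new(items, seen):
    """Return the items whose derived GUID is not yet in `seen`, and record them."""
    fresh = []
    for item in items:
        guid = stable_guid(item["title"], item["message"], item.get("pub_date", ""))
        if guid not in seen:
            seen.add(guid)
            fresh.append(item)
    return fresh
```

With something along these lines, a second run over an unchanged feed would yield no new items, whereas a fetch-time timestamp in the GUID would make every run look new. In a real site the `seen` set would presumably be the stored GUIDs rather than an in-memory set.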

