Threats Watcher - Core Algorithm

Watcher.threats_watcher.core.cleanup()

Remove words that were created more than 30 days ago.
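As an illustration of the 30-day retention rule, here is a minimal sketch that works on an in-memory mapping rather than on the database. The helper name `drop_expired` and the word-to-datetime mapping are assumptions made for the example, not part of the module.

```python
from datetime import datetime, timedelta, timezone

RETENTION = timedelta(days=30)  # retention window stated in the description


def drop_expired(words, now):
    """Return only the entries created within the last 30 days.

    `words` maps a word to its creation datetime. This is a hypothetical
    in-memory stand-in for the stored words, not the module's data model.
    """
    cutoff = now - RETENTION
    return {word: created for word, created in words.items() if created >= cutoff}


now = datetime(2024, 3, 31, tzinfo=timezone.utc)
words = {
    "ransomware": now - timedelta(days=2),   # recent, kept
    "botnet": now - timedelta(days=45),      # older than 30 days, dropped
}
kept = drop_expired(words, now)
```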

Watcher.threats_watcher.core.extract_entities_and_threats(title: str) → dict

Extract and clean entities and threats from a title using the NER model.

Watcher.threats_watcher.core.fetch_last_posts(nb_max_post)

Fetch the last nb_max_post posts for each feed.

Parameters:

nb_max_post – The depth of the search on each feed (maximum number of posts fetched per feed).
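The effect of nb_max_post can be sketched as a per-feed cap on the number of entries. The helper `take_last_posts`, the feed names, and the assumption that entries are ordered newest first are all hypothetical; the real function retrieves entries from the configured feeds.

```python
def take_last_posts(feeds, nb_max_post):
    """Keep at most `nb_max_post` entries per feed.

    Assumes each list of entries is ordered newest first, so slicing the
    head of the list yields the most recent posts.
    """
    return {name: entries[:nb_max_post] for name, entries in feeds.items()}


feeds = {
    "feed-a": ["post 5", "post 4", "post 3", "post 2", "post 1"],
    "feed-b": ["post 2", "post 1"],
}
latest = take_last_posts(feeds, 3)
```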

Watcher.threats_watcher.core.focus_five_letters()

Focus on words that are 5 letters long.

Watcher.threats_watcher.core.focus_on_top(words_occurrence)

Focus on top words. Populates the database with only the words that occur at least “words_occurrence” times in feeds. Also triggers breaking news when the threshold is exceeded, and generates an AI summary for newly created or updated words.

Parameters:

words_occurrence – Minimum number of occurrences of a word in feeds.
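The minimum-occurrence filter can be illustrated on a plain word counter. This is only a sketch of the thresholding step: `keep_top_words` is a hypothetical helper, and the database writes, breaking-news trigger, and AI summary are left out.

```python
from collections import Counter


def keep_top_words(counts, words_occurrence):
    """Return the words seen at least `words_occurrence` times."""
    return {word: n for word, n in counts.items() if n >= words_occurrence}


counts = Counter(["phishing", "phishing", "phishing", "malware", "malware", "patch"])
top = keep_top_words(counts, 2)
```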

Watcher.threats_watcher.core.get_confidence_score(confidence)

Converts a confidence level (1, 2 or 3) to a percentage.

Watcher.threats_watcher.core.get_normalized_domain(url)

Extracts and normalizes the domain from a URL without a network query (for Source objects).
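One way to obtain a domain without any network query is to parse the URL locally with the standard library. The sketch below is an assumption about what "normalizes" could mean (lowercasing, dropping the port and a leading "www."); the module's actual rules may differ, and `normalized_domain` is a hypothetical name.

```python
from urllib.parse import urlsplit


def normalized_domain(url):
    """Return the lowercased host of `url`, without port or leading 'www.'."""
    if "://" not in url:
        # Without a scheme, urlsplit treats the host as part of the path,
        # so prefix '//' to mark it as a network location.
        url = "//" + url
    host = urlsplit(url).hostname or ""  # hostname is already lowercased
    if host.startswith("www."):
        host = host[len("www."):]
    return host


domain = normalized_domain("https://WWW.Example.com:8443/news/item?id=1")
```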

Watcher.threats_watcher.core.get_pre_redirect_domain(url)

Retrieves the domain of the URL before the redirect.

Watcher.threats_watcher.core.load_feeds()

Load feeds.

Watcher.threats_watcher.core.main_watch()
Main function:
  • close_old_connections()

  • load_feeds()

  • fetch_last_posts(settings.POSTS_DEPTH)

  • tokenize_count_urls()

  • remove_banned_words()

  • focus_five_letters()

  • focus_on_top(settings.WORDS_OCCURRENCE)

  • send_threats_watcher_notifications()

Watcher.threats_watcher.core.reliability_score()

Calculates the reliability score for each TrendyWord by scanning its associated PostUrls.

Watcher.threats_watcher.core.remove_banned_words()

Clean the posts of specific patterns: first BannedWord entries, then common English and French words.
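The two-stage filtering described above can be sketched on a word counter. The banned and common word sets here are tiny illustrative samples, and `remove_banned` is a hypothetical helper; the real function reads BannedWord entries from the database and uses full English and French word lists.

```python
def remove_banned(counts, banned, common):
    """Drop banned words first, then common words (case-insensitive)."""
    banned = {w.lower() for w in banned}
    common = {w.lower() for w in common}
    without_banned = {w: n for w, n in counts.items() if w.lower() not in banned}
    return {w: n for w, n in without_banned.items() if w.lower() not in common}


counts = {"ransomware": 4, "the": 12, "pour": 3, "newsletter": 5}
cleaned = remove_banned(
    counts,
    banned={"newsletter"},       # sample stand-in for BannedWord entries
    common={"the", "pour"},      # sample English and French common words
)
```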

Watcher.threats_watcher.core.send_threats_watcher_notifications(content)

Send notifications for Threats Watcher events to all enabled subscribers. Detects notification type from content and prepares the context for downstream handlers.

Watcher.threats_watcher.core.start_scheduler()
Launch scheduled tasks in the background:
  • Fire main_watch every 30 minutes

  • Fire cleanup every day at 8 am

  • Fire weekly summary based on settings configuration
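To make the two fixed schedules concrete, the following sketch computes the next fire times with the standard library only. It does not reproduce the scheduler the module uses; `next_interval_run` and `next_daily_run` are hypothetical helpers, and the weekly summary is omitted because its timing depends on settings.

```python
from datetime import datetime, timedelta


def next_interval_run(last_run, minutes=30):
    """Next fire time for a job repeated every `minutes` minutes."""
    return last_run + timedelta(minutes=minutes)


def next_daily_run(now, hour=8):
    """Next fire time for a job that runs once a day at `hour`:00."""
    run = now.replace(hour=hour, minute=0, second=0, microsecond=0)
    if run <= now:
        # Today's slot has already passed, so schedule for tomorrow.
        run += timedelta(days=1)
    return run


now = datetime(2024, 1, 1, 9, 15)
next_watch = next_interval_run(now)   # main_watch, every 30 minutes
next_cleanup = next_daily_run(now)    # cleanup, every day at 8 am
```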

Watcher.threats_watcher.core.tokenize_count_urls()
For each title (≤ 30 days):
  • Runs NER and threat extraction,

  • Casts scores to float,

  • Counts occurrences and aggregates associated URLs.
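The counting and URL aggregation step can be sketched as follows. Note the loud assumption: plain whitespace tokenization stands in for the NER and threat extraction, scores are ignored, and `count_words` with its (title, url) input is hypothetical.

```python
from collections import defaultdict


def count_words(posts):
    """Count in how many titles each word appears and collect their URLs.

    `posts` is an iterable of (title, url) pairs. Whitespace splitting is
    a placeholder for the NER-based extraction.
    """
    counts = defaultdict(int)
    urls = defaultdict(set)
    for title, url in posts:
        # Use a set so a word repeated within one title is counted once.
        for word in set(title.lower().split()):
            counts[word] += 1
            urls[word].add(url)
    return dict(counts), {word: sorted(found) for word, found in urls.items()}


posts = [
    ("New ransomware campaign", "https://example.com/a"),
    ("Ransomware hits hospital", "https://example.org/b"),
]
counts, urls = count_words(posts)
```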