AutoPR logo

AutoPR: Let's Automate Your Academic Promotion!

> LARG, Harbin Institute of Technology
> Central South University
> The University of Hong Kong
> ByteDance China (Seed)

* Equal contribution - Contact: qgchen@ir.hit.edu.cn, zyan@ir.hit.edu.cn, car@ir.hit.edu.cn, lbqin@csu.edu.cn

Key Highlights

AutoPR Task

Formalizes automatic translation of scholarly papers into channel-aware promotion content optimized for fidelity, alignment, and engagement.

PRBench Benchmark

512 expertly curated paper-to-post pairs across platforms, complete with weighted factual checklists and human preference judgements.

PRAgent Framework

Three-stage agentic pipeline that boosts watch time by 604% and likes by 438% over strong LLM and rule-based baselines.

Overview of the PRAgent framework

Curious about how AutoPR performs in practice? Visit our Showcase for detailed case studies.

Abstract

As the volume of peer-reviewed research surges, scholars increasingly rely on social platforms for discovery while authors invest significant effort in promotion to sustain visibility and citations. We introduce Automatic Promotion (AutoPR), a new task that translates research papers into faithful, engaging, and well-timed public-facing content. To enable rigorous study, we release PRBench, a multimodal benchmark linking 512 peer-reviewed articles to high-quality promotional materials and evaluating systems along three axes: Fidelity (accuracy and tone), Engagement (stakeholder targeting and appeal), and Alignment (timing and channel optimization). We further present PRAgent, a modular agentic framework that automates scholarly promotion in three stages: hierarchical content extraction with multimodal preparation; collaborative multi-agent synthesis for polished, publication-ready outputs; and platform-specific adaptation that models norms, tone, and tagging to maximize reach. Against strong LLM pipelines and rule-based tools on PRBench and downstream social metrics, PRAgent delivers substantial gains, including up to a 604% increase in total watch time, a 438% increase in likes, and at least a 2.9x rise in overall engagement. Ablations attribute the largest improvements to platform modeling and targeted promotion. Our results establish AutoPR as a tractable, measurable research problem and chart a path toward scalable, trustworthy, and impactful automated scholarly communication.

Automatic Promotion Task

AutoPR frames scholarly promotion as conditional generation over rich research assets. Each instance starts from a research dossier \(\mathbb{D} = (D_T, D_V, D_S)\) that bundles the full manuscript, figure-caption pairs, and supplementary materials with curated talking points. A dissemination target \((\mathbb{T}_P, \mathbb{T}_A)\) specifies the delivery platform and intended audience persona, grounding tone, cadence, and visual affordances.

\[\hat{P} = \operatorname*{argmax}\limits_{P} \Pr\big(P \mid \mathbb{D}, \mathbb{T}_P, \mathbb{T}_A\big)\]

The generator seeks a Pareto-efficient post \(\hat{P}\) that negotiates competing objectives. AutoPR scores candidate posts with a triad of metrics that reward trustworthy science communication while honoring channel norms.

  • Fidelity \(\mathcal{S}_{\text{Fidelity}}\) checks factual accuracy, coverage of core claims, and preservation of scientific nuance against \(\mathbb{D}\).
  • Alignment \(\mathcal{S}_{\text{Align}}\) measures stylistic fit, pacing, and media mix relative to platform expectations \(\mathbb{T}_P\).
  • Engagement \(\mathcal{S}_{\text{Engage}}\) estimates resonance with the target audience \(\mathbb{T}_A\) via hooks, accessibility cues, and call-to-action design.

\[\vec{F}(\hat{P}) = \alpha_1 \mathcal{S}_{\text{Fidelity}}(\hat{P} \mid \mathbb{D}) + \alpha_2 \mathcal{S}_{\text{Align}}(\hat{P} \mid \mathbb{T}_P) + \alpha_3 \mathcal{S}_{\text{Engage}}(\hat{P} \mid \mathbb{T}_A)\]

Balancing these weighted objectives yields a frontier of diverse promotional narratives, ranging from expert-facing summaries to public-friendly explainers, ready for downstream adaptation by PRAgent.
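
To make the selection concrete, the sketch below scores a finite pool of candidate posts with the weighted objective and picks the argmax. It is a minimal illustration: the Scorer interface and the example weights (0.4, 0.3, 0.3) are assumptions for this sketch, not values or code from the paper.

from typing import Callable

# Each scorer maps a candidate post to a score in [0, 1]; in practice it would
# close over the context it conditions on (the dossier D, platform T_P, or audience T_A).
Scorer = Callable[[str], float]

def overall_score(post: str,
                  fidelity: Scorer, alignment: Scorer, engagement: Scorer,
                  weights: tuple[float, float, float] = (0.4, 0.3, 0.3)) -> float:
    """alpha_1 * S_Fidelity + alpha_2 * S_Align + alpha_3 * S_Engage for one candidate."""
    a1, a2, a3 = weights
    return a1 * fidelity(post) + a2 * alignment(post) + a3 * engagement(post)

def select_best(candidates: list[str],
                fidelity: Scorer, alignment: Scorer, engagement: Scorer) -> str:
    """argmax over a finite candidate pool, mirroring the formulation above."""
    return max(candidates,
               key=lambda p: overall_score(p, fidelity, alignment, engagement))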

AutoPR task overview

PRBench Benchmark

PRBench evaluates automatic promotion systems with expert-curated scores that cover the full lifecycle of science communication.

  • Fidelity checks that promotional posts stay truthful to the source paper:
    • Authorship & Title Accuracy verifies correct attribution and prominent mention of researchers and paper title.
    • Weighted Factual Checklist scores coverage of expert-authored claims, emphasizing high-weight scientific contributions.
  • Alignment captures whether content fits platform conventions and channel strategy:
    • Contextual Relevance examines tone, pacing, and jargon against audience expectations.
    • Visual-Text Integration inspects how captions and imagery reinforce each other for the target feed.
    • Hashtag & Mention Strategy reviews tagging choices that drive discoverability.
    • Platform Interest pairwise comparisons estimate which post a platform-native audience would prefer.
  • Engagement tests the persuasive strength of the message:
    • Engagement Hook Strength looks at the opening line’s ability to grab attention.
    • Logical Attractiveness measures narrative clarity and flow.
    • Visual Attractiveness rates originality and information density of shared media.
    • Call-To-Action Score checks whether readers are guided to meaningful follow-ups.
    • Professional Interest and Broader Interest pairwise wins gauge appeal to expert and public audiences respectively.

Every sub-score is assigned by three trained annotators, with disagreements reconciled to build a dependable gold standard for PRBench and its stratified PRBench-Core subset.
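
As a rough illustration of the quantitative scores above, the sketch below computes a weighted factual-checklist coverage and a pairwise win rate. The function names, the covers predicate, and its claim-matching mechanism are assumptions for this sketch; the released PRBench scoring relies on expert-authored checklists and the trained annotators described above.

from typing import Callable

def weighted_checklist_score(post: str,
                             checklist: list[tuple[str, float]],
                             covers: Callable[[str, str], bool]) -> float:
    """Weighted coverage of expert-authored claims.
    `checklist` holds (claim, weight) pairs; `covers(post, claim)` decides
    whether the post faithfully states the claim (e.g., an entailment check)."""
    total = sum(weight for _, weight in checklist)
    hit = sum(weight for claim, weight in checklist if covers(post, claim))
    return hit / total if total else 0.0

def pairwise_win_rate(wins: int, comparisons: int) -> float:
    """Share of pairwise preference judgements a post wins (as in the
    Platform, Professional, and Broader Interest comparisons)."""
    return wins / comparisons if comparisons else 0.0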

PRAgent Framework

PRAgent is a modular agentic workflow spanning three stages:

  1. Content Extraction: hierarchical parsing condenses long-form papers and pairs visuals with captions.
  2. Multi-Agent Synthesis: specialized agents draft logical narratives, analyze visuals, and enrich tone.
  3. Platform Adaptation: orchestration agents tailor hooks, hashtags, and pacing to each channel.
PRAgent pipeline
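
The sketch below strings the three stages into one workflow to show how they hand off to each other. Every function name, prompt, and the single llm callable are illustrative assumptions; the actual PRAgent implementation uses specialized agents at each stage.

from typing import Callable

LLM = Callable[[str], str]  # prompt -> completion; any chat-model wrapper fits here

def extract_content(paper_text: str, captions: list[str]) -> tuple[str, str]:
    """Stage 1 placeholder: hierarchical parsing and figure-caption pairing.
    Naive truncation stands in for real structured extraction."""
    return paper_text[:4000], "\n".join(captions)

def synthesize(llm: LLM, sections: str, figure_notes: str) -> str:
    """Stage 2: draft the narrative, analyze visuals, and enrich tone."""
    narrative = llm("Summarize the core contributions:\n" + sections)
    visuals = llm("Explain what the figures add:\n" + figure_notes)
    return llm("Combine into a polished promotion draft:\n" + narrative + "\n" + visuals)

def adapt_to_platform(llm: LLM, draft: str, platform: str) -> str:
    """Stage 3: rewrite hook, hashtags, and pacing for the target channel."""
    return llm(f"Adapt this draft to {platform} conventions (hook, hashtags, pacing):\n" + draft)

def promote(llm: LLM, paper_text: str, captions: list[str], platform: str) -> str:
    sections, figure_notes = extract_content(paper_text, captions)
    draft = synthesize(llm, sections, figure_notes)
    return adapt_to_platform(llm, draft, platform)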

Evaluation and Insights

What are LLMs' limitations in academic promotion generation?

Direct prompting across the 512-sample PRBench benchmark shows that frontier LLMs still fall short of human-crafted promotion posts. The companion PRBench-Core subset (128 stratified samples) exposes the same failure patterns while enabling rapid iteration.

  • Factual gaps dominate. Numerical, methodological, and terminology errors make up over 90% of inaccuracies for models such as Qwen-2.5-VL-32B-Ins., revealing difficulty in preserving the precise contributions of each paper.
  • Engagement tactics plateau. LLMs repeatedly choose generic hook strategies instead of tailoring flows to audience intent, leaving logical and visual attractiveness below human-authored references.
  • Platform cues stay off-target. Generated hashtags share only 0.0347 Jaccard similarity with the originals, indicating shallow alignment with the discovery mechanisms of expert communities.
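
The hashtag mismatch above is measured with Jaccard similarity between generated and human-authored tag sets. A minimal sketch, assuming tags are already lowercased strings with their leading '#' kept; the preprocessing in the paper may differ:

def hashtag_jaccard(generated: set[str], reference: set[str]) -> float:
    """Jaccard similarity: |intersection| / |union| of the two hashtag sets."""
    if not generated and not reference:
        return 1.0
    return len(generated & reference) / len(generated | reference)

# Toy example: one shared tag out of five distinct tags -> 0.2.
print(hashtag_jaccard({"#ai", "#nlp", "#research"}, {"#acl2025", "#nlproc", "#ai"}))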

To score these behaviors at scale, we adopt Qwen-2.5-VL-72B-Inst. as the automatic judge. Its decisions correlate with human ratings at up to 0.98 on factual accuracy and 0.75 on authorship fidelity, giving us a reliable signal across PRBench and PRBench-Core.
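
The judge validation can be reproduced in spirit by correlating judge scores with human ratings on the same posts. The page does not state which coefficient is used, so the sketch below assumes Pearson's r from the standard library, and the numbers in the usage line are made up:

from statistics import correlation  # Pearson's r; available in Python 3.10+

def judge_agreement(judge_scores: list[float], human_scores: list[float]) -> float:
    """Correlation between automatic-judge scores and human ratings on the same posts."""
    return correlation(judge_scores, human_scores)

# Hypothetical scores for illustration only; real validation uses PRBench annotations.
print(judge_agreement([0.9, 0.4, 0.7, 0.2], [0.85, 0.35, 0.75, 0.25]))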

Error analysis on PRBench covering factual drift, hook choices, and hashtag mismatch

LLM failures on PRBench concentrate on missing precise facts, overusing generic engagement hooks, and diverging from human hashtag choices.

Agents can improve automatic promotion quality.

PRAgent orchestrates content extraction, multi-agent synthesis, and platform adaptation to remediate the above shortcomings. We evaluate 12 competitive LLM backbones on PRBench-Core for fast comparison and confirm the same ordering on the full 512-sample PRBench.

  • Cross-model gains. PRAgent adds at least 7.15 percentage points over direct prompting across all backbones and exceeds 20 points with GPT-5-mini, lifting average scores into the mid-to-high 70s.
  • Each stage matters. Ablations on Qwen-2.5-VL-32B-Ins. show platform adaptation drives alignment from 71.36 to 79.38 and structured content extraction raises fidelity from 66.38 to 70.76.
  • Scales to full PRBench. The same ordering observed on PRBench-Core carries over to the 512-sample benchmark, showing that PRAgent generalizes beyond the rapid subset.

Top: PRAgent lifts fidelity, engagement, and alignment metrics over direct prompting for every backbone on PRBench-Core. Bottom: The same ordering holds on the full 512-sample PRBench benchmark.

Real-world deployment on RedNote.

To validate practicality, we ran a 10-day RedNote study with paired accounts posting the same papers each day. PRAgent powered one account, while the baseline relied on direct prompting with the same GPT-5 backbone.

  • Sustained engagement lift. PRAgent posts consistently outperformed the baseline on likes, saves, and shares for every paper in the study.
  • Audience growth. Total watch time increased by 604 percent and profile visitors by 575 percent relative to the direct prompting account.
  • Stable workflow. Both accounts consumed identical model budgets, highlighting that gains come from workflow design rather than brute-force scaling.
RedNote A/B test comparing PRAgent with direct prompting

Real-world RedNote deployment: PRAgent drives higher interactions per paper and sustained audience growth compared to direct prompting.

Citation

If you find AutoPR helpful, please cite our work.

@article{chen2025autopr,
  title={AutoPR: Let's Automate Your Academic Promotion!},
  author={Chen, Qiguang and Yan, Zheng and Yang, Mingda and Qin, Libo and Yuan, Yixin and Li, Hanjing and Liu, Jinhao and Ji, Yiyan and Peng, Dengyun and Guan, Jiannan and Hu, Mengkang and Du, Yantao and Che, Wanxiang},
  journal={Manuscript},
  year={2025},
  note={\url{https://autopr.github.io}}
}

Contact

Please open an issue on GitHub or email Qiguang Chen or Zheng Yan if you have any questions or suggestions.

Acknowledgement

This website is adapted from Nerfies, LLaVA-RLHF, and M3CoT, licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.

Usage and License Notices: The data, code and checkpoints are intended and licensed for research use only. Please ensure compliance with the corresponding licenses when using PRBench, PRAgent, or other AutoPR assets.