SAGE Publishing

Knowledge Graphs as a catalyst for change

GCP

Neo4j

Data Engineering

Industry: Publishing

Project Type: Data Solutions

Technologies: Neo4j, Python, Jupyter Notebooks, GCP, Docker, Google Cloud Storage, SSL

“Well regarded and respected in the graph industry, we approached OpenCredo to aid us on our journey of detecting and identifying potential fraudulent forms of authorships within our publications. We have been delighted with the outcome and the partnership to date. Extremely professional, we received an end-to-end service which included design, development and delivery of a production-ready cloud based POC able to leverage and explore data through key graph and ML algorithms.”

Helen King,
Director of Transformation

Detecting author misconduct with the power of knowledge

SAGE Publishing is a global academic publisher driven by the belief that social and behavioural science has the power to improve society. SAGE produces educational resources that support instructors to prepare the citizens, policy makers, educators and researchers of the future. They successfully publish more than 1,000 journals and 900 new books globally each year.

SAGE approached OpenCredo for help on its journey to become more data-driven by gaining insights into its creators, its customers and, ultimately, its business.

Publishing companies are actively battling industrialised cheating, in which some companies systematically produce falsified research for purchase. Authors of a research paper are required to have contributed to the research it describes, so paying for an authorship slot is a form of misconduct.

SAGE was looking for a technology partner to develop a solution that would identify these fraudulent forms of authorships.

Why OpenCredo?

OpenCredo has deep expertise in the data space, from defining the Data Strategy to the design and implementation of end-to-end Data & ML solutions. We’ve helped various clients set up secure and scalable data engineering pipelines and data science development environments, so that they can experiment faster and gain more actionable insights.

Using graph technologies to uncover hidden links

OpenCredo worked to deliver a solution that would enable the SAGE team to work collaboratively on a secure data science platform where they could explore the relationships between authors and published papers. Our team of consultants achieved this by modelling the data set and ingesting it into a Neo4j graph database to create a knowledge graph.

The GCP-based solution consisted of an automated ingestion pipeline capable of running graph-based algorithms for fraud detection. The team ingested 1 TB of data and deployed it onto fully secured infrastructure, where it could be stored and queried for further analysis.

Over 300 million nodes and 2 billion relationships were analysed using Jupyter Notebooks and different graph traversal algorithms, leveraging both similarity and community detection.
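To make the ideas of similarity and community detection more concrete, here is a minimal Python sketch on a toy data set. It is illustrative only and is not the implementation delivered to SAGE: the author and paper names, the `AUTHORSHIPS` list, the helper functions and the 0.5 threshold are all hypothetical. Similarity is computed as the Jaccard overlap between the sets of papers two authors appear on, and connected components over the resulting "similar author" links stand in for a fuller community detection algorithm.

```python
from collections import defaultdict
from itertools import combinations

# Hypothetical toy data: (author, paper) authorship pairs.
AUTHORSHIPS = [
    ("alice", "p1"), ("alice", "p2"), ("alice", "p3"),
    ("bob", "p1"), ("bob", "p2"), ("bob", "p3"),
    ("carol", "p3"), ("carol", "p4"),
    ("dave", "p5"), ("erin", "p5"),
]


def papers_by_author(authorships):
    """Build an author -> set-of-papers index from authorship pairs."""
    index = defaultdict(set)
    for author, paper in authorships:
        index[author].add(paper)
    return dict(index)


def jaccard(a, b):
    """Jaccard similarity of two sets: |intersection| / |union|."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def similar_pairs(index, threshold):
    """Return {(author_a, author_b): score} for pairs at or above the threshold."""
    pairs = {}
    for a, b in combinations(sorted(index), 2):
        score = jaccard(index[a], index[b])
        if score >= threshold:
            pairs[(a, b)] = score
    return pairs


def communities(nodes, edges):
    """Group nodes into connected components using union-find.

    This is a simple stand-in for community detection, not a
    modularity-based algorithm such as Louvain.
    """
    parent = {n: n for n in nodes}

    def find(n):
        while parent[n] != n:
            parent[n] = parent[parent[n]]  # path halving
            n = parent[n]
        return n

    for a, b in edges:
        parent[find(a)] = find(b)

    groups = defaultdict(set)
    for n in nodes:
        groups[find(n)].add(n)
    # Largest groups first, ties broken alphabetically.
    return sorted((sorted(g) for g in groups.values()), key=lambda g: (-len(g), g))


index = papers_by_author(AUTHORSHIPS)
pairs = similar_pairs(index, threshold=0.5)  # hypothetical threshold
groups = communities(index, pairs)

print(pairs)
print(groups)
```

In this toy example, groups of authors with heavily overlapping publication records end up in the same community, which an analyst could then examine more closely. At the scale described above, the pairwise comparison would not be practical in plain Python, and a graph database with built-in graph algorithms is the more natural fit.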

Becoming a data-driven organisation

During the delivery, SAGE’s data science team were also upskilled, enabling them to continue the work of identifying fraudulent authorship.

Through this partnership, different departments in the organisation were able to gain a deeper understanding of the business and are starting to discuss how to make data central to all their decision making. SAGE are now taking this further by building their own in-house data product teams.

Through this partnership, SAGE Publishing acquired:

Production-ready POC that can detect potential authorship misconduct through graph and ML algorithms, which SAGE is now looking to take into production.

Digital transformation through knowledge and understanding gained across the organisation. SAGE is now taking further steps towards becoming a data-driven organisation by building in-house data product teams.

Increased data literacy, with different departments in the organisation gaining more understanding of the data they own and an appetite to learn about it and enrich it via external resources using graph algorithms.

Upskilling and knowledge transfer across their data science team, enabling them to continue the work of detecting authorship misconduct in-house.
