mercari AI’s research “LLMs as an Interactive Database Interface for Designing Large Queries” accepted to the HILDA Workshop at SIGMOD 2024

Overview

We are pleased to announce that the paper "LLMs as an Interactive Database Interface for Designing Large Queries" by engineers Yilin Li and Deddy Jobson of the Mercari AI team has been accepted to the HILDA Workshop at SIGMOD 2024, an international conference on the management of data.

SIGMOD is one of the most prestigious international conferences in the field of data management. It is held annually and brings together researchers from around the world. This year's conference, the 49th, was held in Santiago, Chile, from June 9 to 14, 2024.

Key points of the presentation

  • Current Text2SQL methods tend to produce inexact queries when scaled to databases as large as Mercari's.
  • We built a system that incorporates human feedback to iteratively refine SQL queries.
  • We plan on leveraging knowledge graphs in the next iteration for better schema linking.

Background

In large companies, writing SQL queries can be a time-consuming process. The required data can be scattered across a plethora of tables that are constantly updated, making it difficult for a layperson to write queries for one-off data analytics tasks.

Summary of the paper

To improve data democratization at Mercari, we implemented a solution that uses LLMs for Text2SQL. While a number of methods already exist, they tend to suffer from errors that are hard to catch. For that reason, instead of treating Text2SQL as a one-shot generation process, we treat it as an interactive problem with a human in the loop. We built a system that uses validation stages and human feedback to polish the query before returning it to the end user. Our method results in fewer errors and, therefore, more useful queries.
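
To make the idea concrete, here is a minimal sketch of such a loop. It is not the system from the paper: the names `validate_sql`, `refine_query`, `fake_generate`, and `fake_feedback` are hypothetical, the generator is a hard-coded stand-in for an LLM call, and validation is approximated by asking an in-memory SQLite database to plan the query.

```python
import sqlite3
from typing import Callable, List, Optional


def validate_sql(conn: sqlite3.Connection, query: str) -> Optional[str]:
    """Return None if SQLite can plan the query, otherwise the error message."""
    try:
        # Planning the query checks syntax, table names, and column names
        # without actually running it.
        conn.execute(f"EXPLAIN QUERY PLAN {query}")
        return None
    except sqlite3.Error as exc:
        return str(exc)


def refine_query(
    request: str,
    generate: Callable[[str, List[str]], str],
    get_feedback: Callable[[str], Optional[str]],
    conn: sqlite3.Connection,
    max_rounds: int = 5,
) -> str:
    """Generate, validate, and revise a query until the user accepts it."""
    notes: List[str] = []  # validation errors and user comments so far
    for _ in range(max_rounds):
        query = generate(request, notes)
        error = validate_sql(conn, query)
        if error is not None:
            # Validation stage: feed the error back into the next attempt.
            notes.append(f"validation error: {error}")
            continue
        feedback = get_feedback(query)
        if feedback is None:
            # Human-in-the-loop stage: no comment means the query is accepted.
            return query
        notes.append(f"user feedback: {feedback}")
    raise RuntimeError("no accepted query within max_rounds")


def fake_generate(request: str, notes: List[str]) -> str:
    """Hypothetical stand-in for an LLM: improves as notes accumulate."""
    if not notes:
        return "SELECT user_id FROM purchases"  # wrong table name
    if len(notes) == 1:
        return "SELECT user_id FROM orders"  # valid, but misses the filter
    return "SELECT user_id, amount FROM orders WHERE amount > 100"


def fake_feedback(query: str) -> Optional[str]:
    """Hypothetical stand-in for a human reviewer."""
    return None if "amount > 100" in query else "only orders above 100"


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (user_id INTEGER, amount REAL)")

final_query = refine_query(
    "users with orders above 100", fake_generate, fake_feedback, conn
)
print(final_query)
```

In this toy run, the first attempt is rejected by validation, the second by the reviewer, and the third is accepted. A real deployment would replace the two stand-in functions with an LLM call and an interactive prompt.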

About the Customer Understanding Team

The Customer Understanding team is dedicated to analyzing the behavior of Mercari Group customers and proposing strategies to optimize their journey and lifetime value. We employ a variety of techniques, such as statistics, mathematical optimization, and large language models.

Author

ML Engineer

Deddy Jobson

  • Causal Inference
  • Marketing Strategy
  • Statistical Machine Learning

Deddy is an ML Engineer who joined Mercari as a new graduate. He is responsible for analyzing marketing campaigns using statistical models and mathematical optimization. In addition to predicting which users respond positively or negatively to campaigns, Deddy provides detailed explanations for those behaviors. He then shares those insights with stakeholders to discuss ways to improve future campaigns.

ML Engineer

Yilin Li

  • Data Science
  • Marketing Strategy
  • Mathematical Optimization

Yilin is an ML Engineer at Mercari Japan and a member of the Marketing Data Science Team. Her background is in applying modeling, mathematical optimization, and other techniques to marketing strategies. She has experience designing solutions that improve profit by integrating ML and mathematical optimization, as well as analyzing user behavior to improve user engagement through personalized communication.