QiMeng-SALV: Signal-Aware Learning for Verilog Code Generation

1State Key Lab of Processors, Institute of Computing Technology, CAS
2University of Chinese Academy of Sciences
3University of Science and Technology of China

Abstract

The remarkable progress of Large Language Models (LLMs) presents promising opportunities for Verilog code generation, which is critically important for automated circuit design. However, the lack of meaningful functional rewards hinders preference optimization based on Reinforcement Learning (RL) for producing functionally correct Verilog code. In this paper, we propose Signal-Aware Learning for Verilog code generation (QiMeng-SALV), which leverages code segments of functionally correct output signals to optimize RL training. Since Verilog code specifies the structural interconnection of hardware gates and wires, different output signals are independent; the key insight of QiMeng-SALV is therefore to extract verified signal-aware implementations from partially incorrect modules, so as to enhance the extraction of meaningful functional rewards. Specifically, we verify the functional correctness of signals in a generated module by comparing them with those of the reference module in the training data. An abstract syntax tree (AST) is then employed to identify signal-aware code segments that can provide meaningful functional rewards even from erroneous modules. Finally, we introduce signal-aware DPO, which is optimized on the correct signal-level code segments, thereby preventing noise and interference from incorrect signals. The proposed QiMeng-SALV underscores a paradigm shift from conventional module-level to fine-grained signal-level optimization in Verilog code generation, addressing the issue of insufficient functional rewards. Experiments demonstrate that our method achieves state-of-the-art performance on VerilogEval and RTLLM, with a 7B-parameter model matching the performance of the 671B DeepSeek-v3 model and significantly outperforming the leading open-source model CodeV trained on the same dataset.

Motivation

Because Verilog signals are relatively independent and evaluated in parallel, the code implementing a specific signal can be extracted through the AST. In the example, the module has two output signals whose implementations can be extracted separately: the implementation of signal a is incorrect, while signal d is correctly implemented. The code implementing the correct signal d can then be used to provide functional-correctness rewards for RL.
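The per-signal independence described above can be illustrated with a toy sketch. Plain string matching stands in for the AST traversal the method actually uses, and the module text and function names are invented for illustration:

```python
import re

# Hypothetical generated module, mirroring the example above: output
# signal `a` is implemented incorrectly, while signal `d` is correct.
GENERATED = """
module top(input b, input c, output a, output d);
  assign a = b ^ c;  // incorrect: the spec asks for b & c
  assign d = b | c;  // correct
endmodule
"""

def extract_signal_impl(src: str, signal: str) -> str:
    """Return the continuous assignment driving `signal`.

    A deliberately simplified stand-in for the paper's AST-based
    extraction: it only handles single `assign` statements, not
    always-blocks or chains of intermediate signals.
    """
    match = re.search(rf"assign\s+{re.escape(signal)}\s*=[^;]*;", src)
    return match.group(0) if match else ""

print(extract_signal_impl(GENERATED, "a"))  # the faulty segment
print(extract_signal_impl(GENERATED, "d"))  # the correct segment
```

Because each extracted segment is tied to exactly one output signal, the correct segment for d can be rewarded even though the module as a whole fails verification.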

Overview

(a) The proposed QiMeng-SALV comprises three stages: signal-aware verification, signal-aware code extraction, and signal-aware DPO training.
(b) In the signal-aware verification stage, output-signal discrepancies between generated modules and their reference counterparts are analyzed, allowing precise identification of correctly functioning output signals.
(c) In the signal-aware code extraction stage, AST parsing reveals the dependencies between output signals and intermediate signals, yielding the preferred and dispreferred code segments pertinent to the contrastive signals.
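The verification stage can be sketched as follows. The per-cycle trace format and all names here are assumptions made for illustration; the paper compares simulated output signals against the reference module, not necessarily via this data layout:

```python
# Illustrative traces: each output signal maps to its simulated values
# over four cycles. `a` diverges from the reference while `d` matches,
# so only `d` can supply a functional-correctness reward.
REF_TRACE = {"a": [0, 0, 0, 1], "d": [0, 1, 1, 1]}  # reference module
GEN_TRACE = {"a": [0, 1, 1, 0], "d": [0, 1, 1, 1]}  # generated module

def verify_signals(gen: dict, ref: dict) -> dict:
    """Map each reference output signal to True iff the generated
    module's trace for that signal matches cycle-for-cycle."""
    return {sig: gen.get(sig) == ref[sig] for sig in ref}

print(verify_signals(GEN_TRACE, REF_TRACE))  # {'a': False, 'd': True}
```

Verifying per signal rather than per module is what lets partially incorrect generations still contribute training signal.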

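The signal-aware DPO stage restricts the preference loss to tokens inside the contrasted signal segments. Below is a minimal sketch of such a masked loss on per-token log-probabilities; the function name, dict layout, and mask convention are assumptions of this sketch, not the paper's implementation:

```python
import math

def signal_aware_dpo_loss(logp_policy, logp_ref, mask_w, mask_l, beta=0.1):
    """Masked DPO loss: -log(sigmoid(beta * (r_w - r_l))), where each
    log-ratio r is summed only over tokens of the signal's code segment."""
    def masked_sum(logps, mask):
        # Sum log-probs of tokens flagged by the signal-segment mask.
        return sum(lp for lp, m in zip(logps, mask) if m)

    r_w = masked_sum(logp_policy["w"], mask_w) - masked_sum(logp_ref["w"], mask_w)
    r_l = masked_sum(logp_policy["l"], mask_l) - masked_sum(logp_ref["l"], mask_l)
    return -math.log(1.0 / (1.0 + math.exp(-beta * (r_w - r_l))))

# Toy pair: the preferred (correct-signal) segment is more likely under
# the policy than the reference; the dispreferred one is less likely.
loss = signal_aware_dpo_loss(
    logp_policy={"w": [-0.1, -0.2], "l": [-1.0, -2.0]},
    logp_ref={"w": [-0.5, -0.5], "l": [-0.5, -0.5]},
    mask_w=[1, 1], mask_l=[1, 0],
)
print(round(loss, 4))
```

Masking out tokens of unverified signals is what prevents noise from incorrect signals leaking into the preference gradient.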
Main Results

We conduct comprehensive evaluations on the VerilogEval (VerilogEval1.0 and VerilogEval2.0) and RTLLM (v1.1 and v2.0) benchmarks. We compare our proposed method against several baseline approaches, categorized into three groups:
1. General-purpose Foundation Models: GPT-3.5, GPT-4o, GPT-4, and DeepSeek-v3.
2. General Code Models: CodeQwen1.5, Qwen2.5-Coder-Instruct, and DeepSeek-Coder.
3. Domain-Specialized Verilog Models: RTLCoder, CodeV, OriGen, VeriSeek, and VeriPrefer.
Comparison results in Table 1 and Table 2 show that QiMeng-SALV establishes new state-of-the-art results across both benchmarks in the open-source domain.

As shown in Table 1, in the VerilogEval evaluations, QiMeng-SALV demonstrates leading performance in both specification understanding and code completion tasks among open-source solutions, achieving performance comparable to DeepSeek-v3 on the VerilogEval1.0 Machine benchmark, and outperforming GPT-4o on the VerilogEval2.0 completion task.

As shown in Table 2, QiMeng-SALV achieves a remarkable 62.6% functional pass@1 accuracy on the RTLLM v1.1 benchmark and 62.0% on RTLLM v2.0 with merely 7B parameters, significantly exceeding all existing open-source alternatives and rivaling the performance of DeepSeek-v3, a 671B parameter model. Impressively, its functional pass@10 accuracy reaches 81.1% on RTLLM v1.1, surpassing DeepSeek-v3's 72.4%.

BibTeX


@misc{zhang2025qimengsalvsignalawarelearningverilog,
  title={QiMeng-SALV: Signal-Aware Learning for Verilog Code Generation}, 
  author={Yang Zhang and Rui Zhang and Jiaming Guo and Lei Huang and Di Huang and Yunpu Zhao and Shuyao Cheng and Pengwei Jin and Chongxiao Li and Zidong Du and Xing Hu and Qi Guo and Yunji Chen},
  year={2025},
  eprint={2510.19296},
  archivePrefix={arXiv},
  primaryClass={cs.LG},
  url={https://arxiv.org/abs/2510.19296}, 
}