LLM & HPC: Benchmarking DeepSeek's Performance in High-Performance Computing Tasks

Noujoud Nader, Patrick Diehl, Steven R. Brandt, Hartmut Kaiser

Published 2025 in Information Security Conference

ABSTRACT

Large Language Models (LLMs), such as GPT-4 and DeepSeek, have been applied to a wide range of domains in software engineering. However, their potential in the context of High-Performance Computing (HPC) remains largely unexplored. This paper evaluates how well DeepSeek, a recent LLM, performs in generating a set of HPC benchmark codes: a conjugate gradient solver, the parallel heat equation, parallel matrix multiplication, DGEMM, and the STREAM triad operation. We analyze DeepSeek's code generation capabilities for traditional HPC languages such as C++, Fortran, Julia, and Python. The evaluation includes testing for code correctness, performance, and scaling across different configurations and matrix sizes. We also provide a detailed comparison between DeepSeek and another widely used tool, GPT-4. Our results demonstrate that while DeepSeek generates functional code for HPC tasks, it lags behind GPT-4 in terms of the scalability and execution efficiency of the generated code.

PUBLICATION RECORD

Publication year: 2025
Venue: Information Security Conference
Publication date: 2025-03-15
Fields of study: Computer Science
Identifiers: DOI 10.1007/978-3-032-07612-0_48; arXiv 2504.03665
Source metadata: Semantic Scholar

REFERENCES

DeepSeek-R1 incentivizes reasoning in LLMs through reinforcement learning
2025, cited by this paper
LLM Benchmarking with LLaMA2: Evaluating Code Development Performance Across Multiple Programming Languages
2025, cited by this paper
China’s cheap, open AI model DeepSeek thrills scientists
2025, cited by this paper
Evaluating AI-generated code for C++, Fortran, Go, Java, Julia, Matlab, Python, R, and Rust
2024, influential reference
DeepSeek-V3 Technical Report
2024, influential reference
DeepSeek-Coder: When the Large Language Model Meets Programming - The Rise of Code Intelligence
2024, influential reference
The Landscape and Challenges of HPC Research and LLMs
2024, cited by this paper
chatHPC: Empowering HPC users with large language models
2024, cited by this paper
Large language model evaluation for high-performance computing software development
2024, cited by this paper
Comparing Llama-2 and GPT-3 LLMs for HPC kernels generation
2023, cited by this paper
LM4HPC: Towards Effective Language Model Application in High-Performance Computing
2023, cited by this paper
Evaluation of OpenAI Codex for HPC Parallel Programming Models Kernel Generation
2023, cited by this paper
Benchmarking the Parallel 1D Heat Equation Solver in Chapel, Charm++, C++, HPX, Go, Julia, Python, Rust, Swift, and Java
2023, cited by this paper
Scope is all you need: Transforming LLMs for HPC Code
2023, cited by this paper
Creating a Dataset for High-Performance Computing Code Translation using LLMs: A Bridge Between OpenMP Fortran and C++
2023, cited by this paper
LLM4VV: Developing LLM-Driven Testsuite for Compiler Validation
2023, cited by this paper
HPC-GPT: Integrating Large Language Model for High-Performance Computing
2023, cited by this paper
Evaluating Large Language Models Trained on Code
2021, cited by this paper
Applicability of the software cost model COCOMO II to HPC projects
2017, cited by this paper
A review of software surveys on software effort estimation
2003, cited by this paper
An Introduction to the Conjugate Gradient Method Without the Agonizing Pain
1994, cited by this paper
Software Engineering Economics
1993, cited by this paper

CITED BY

LLM Benchmarking with LLaMA2: Evaluating Code Development Performance Across Multiple Programming Languages
2025, cites this paper
LLM4VV: Evaluating Cutting-Edge LLMs for Generation and Evaluation of Directive-Based Parallel Programming Model Compiler Tests
2025, cites this paper
Can LLMs Find Bugs in Code? An Evaluation from Beginner Errors to Security Vulnerabilities in Python and C++
2025, cites this paper
LLM-HPC++: Evaluating LLM-Generated Modern C++ and MPI+OpenMP Codes for Scalable Mandelbrot Set Computation
2025, cites this paper