IKKEM Intelligent Computing Center

Overview

The IKKEM Intelligent Computing Center was established and put into operation in 2022. It uses liquid cooling for energy efficiency and provides 390 CPU nodes, 6 GPU nodes, and 2 fat nodes, supporting model training, simulation, and large-scale scientific computing.

High-performance hardware:

  • CPU Compute Nodes: 2 x Intel Xeon Gold 6338 (32 cores, 2.0 GHz), 256 GB DDR4-3200MHz memory
  • GPU Compute Nodes: 8 x Nvidia A100 80GB SXM, 1.5 TB DDR4-3200MHz memory
  • Fat Nodes: 2 x Intel Xeon Platinum 8358 (32 cores, 2.6 GHz), 2 TB DDR4-3200MHz memory

Storage: data storage capacity of up to 3 PB, with read/write speeds greater than 50 GB/s.

Professional Intelligent Computing Services:

The AI4EC Lab team, which focuses on energy chemistry materials, has developed a specialized heterogeneous (CPU/GPU/non-von Neumann architecture) integrated computing platform based on the Tan Kah Kee Supercomputing Center. This platform supports:

  • Out-of-the-box electrochemical intelligent scientific computing software
  • Customized computing environments and supporting algorithm deployment
  • High-throughput data production, storage, retrieval, and management
  • Training and application of specialized scientific intelligent models

Usage

For computing power at the Intelligent Computing Center, please contact ikkemhpc@xmu.edu.cn and refer to the User Manual for the application process. For professional electrochemistry intelligent computing services, please contact ai4ec@xmu.edu.cn.

Attachment: User Manual

Q&A
  1. How are resources charged?
    The total cost of using the platform cluster is the sum of the CPU, GPU, and storage costs. CPU resources are charged by core hour, and GPU resources are charged by card hour. Please email ikkemhpc@xmu.edu.cn for more information.
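As a rough illustration of the charging formula above, the sketch below adds up the three components. The rates and job sizes are hypothetical placeholders, not the center's actual prices, and the sketch assumes the storage charge is supplied as an already computed amount, since this page does not say how storage is metered.

```python
def core_hours(cores: int, wall_hours: float) -> float:
    """CPU usage in core hours: number of cores times wall-clock hours."""
    return cores * wall_hours


def card_hours(cards: int, wall_hours: float) -> float:
    """GPU usage in card hours: number of GPU cards times wall-clock hours."""
    return cards * wall_hours


def total_cost(cpu_core_hours: float, gpu_card_hours: float,
               storage_cost: float, cpu_rate: float, gpu_rate: float) -> float:
    """Sum of CPU, GPU, and storage costs.

    cpu_rate (price per core hour) and gpu_rate (price per card hour)
    are hypothetical parameters; ask the center for the real values.
    """
    return cpu_core_hours * cpu_rate + gpu_card_hours * gpu_rate + storage_cost


# Hypothetical example: one full CPU node (2 x 32 cores) for 10 hours
# and one full GPU node (8 cards) for 2 hours.
cpu_usage = core_hours(cores=64, wall_hours=10)   # 640 core hours
gpu_usage = card_hours(cards=8, wall_hours=2)     # 16 card hours
cost = total_cost(cpu_usage, gpu_usage, storage_cost=5.0,
                  cpu_rate=0.25, gpu_rate=2.0)    # made-up rates
print(cost)
```

Whether usage is counted on allocated or actually used resources is not stated here, so treat the result only as an estimate.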
  2. What is the maximum running time of a single job?
    The limit is set per partition. Run the `scontrol show partition` command and check the MaxTime parameter of the partition you submit to.
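If you want to read the limits programmatically, a minimal sketch along these lines could work. It relies on the `Key=Value` layout that `scontrol show partition` prints; the partition names and values in the sample text are made up for illustration, and `query_max_times` is a hypothetical helper name.

```python
import subprocess
from typing import Dict


def parse_max_times(scontrol_output: str) -> Dict[str, str]:
    """Map each partition name to its MaxTime value.

    Expects the whitespace-separated Key=Value tokens printed by
    `scontrol show partition`.
    """
    limits: Dict[str, str] = {}
    current = None
    for token in scontrol_output.split():
        key, sep, value = token.partition("=")
        if not sep:
            continue  # not a Key=Value token
        if key == "PartitionName":
            current = value
        elif key == "MaxTime" and current is not None:
            limits[current] = value
    return limits


def query_max_times() -> Dict[str, str]:
    """Run scontrol on the cluster and parse its output (needs Slurm)."""
    result = subprocess.run(["scontrol", "show", "partition"],
                            capture_output=True, text=True, check=True)
    return parse_max_times(result.stdout)


# Hypothetical sample output; real partition names and limits will differ.
SAMPLE_OUTPUT = """\
PartitionName=cpu
   AllowGroups=ALL AllowAccounts=ALL AllowQos=ALL
   DefaultTime=NONE MaxNodes=UNLIMITED MaxTime=7-00:00:00 MinNodes=0
   State=UP TotalCPUs=64 TotalNodes=1

PartitionName=gpu
   AllowGroups=ALL AllowAccounts=ALL AllowQos=ALL
   DefaultTime=NONE MaxNodes=UNLIMITED MaxTime=UNLIMITED MinNodes=0
   State=UP TotalCPUs=64 TotalNodes=1
"""

print(parse_max_times(SAMPLE_OUTPUT))
```

On a login node, calling `query_max_times()` instead of parsing the sample would return the live values.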
  3. Why did my job end with the result node_fail? What should I do?
    node_fail indicates that the job failed because of a fault on a compute node. Resubmit the job; if the failure recurs, contact us via email.
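To check which of your finished jobs ended in this state, one option is to read Slurm's accounting records. The sketch below parses output in the form produced by `sacct --format=JobID,State -n -P`. It assumes job accounting via `sacct` is enabled on this cluster, which this page does not confirm, and the job IDs in the sample are made up.

```python
from typing import List


def find_node_failures(sacct_output: str) -> List[str]:
    """Return the IDs of jobs whose state is NODE_FAIL.

    Expects pipe-delimited lines of the form "JobID|State", as printed by
    `sacct --format=JobID,State -n -P`. Job steps such as "1002.batch"
    are skipped so that each job is reported once.
    """
    failed = []
    for line in sacct_output.splitlines():
        fields = line.strip().split("|")
        if len(fields) < 2:
            continue  # blank or malformed line
        job_id, state = fields[0], fields[1]
        if state.startswith("NODE_FAIL") and "." not in job_id:
            failed.append(job_id)
    return failed


# Hypothetical sample; real output comes from running sacct on the cluster.
SAMPLE_SACCT = """\
1001|COMPLETED
1001.batch|COMPLETED
1002|NODE_FAIL
1002.batch|CANCELLED
"""

print(find_node_failures(SAMPLE_SACCT))
```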
  4. Why was my program terminated on the login node? Can I run programs on the login node?
    Login nodes are intended for lightweight work such as file editing, job submission, compilation of small applications, and file download. Computationally intensive tasks, such as scientific computing and large file verification, consume a lot of computing resources and interfere with other users' normal use. To protect the user experience, a task detection service on the login node detects and kills tasks that use resources abnormally. Submit computationally intensive work as a job instead.
  5. How do I install software on the cluster?
    Before installing software on the cluster, determine what kind of software it is. For commercial software, obtain the right to use it first and then install it. For common open-source software, first check whether it is already installed on the cluster, following the application's documentation. If it is not installed:
    1) Consider whether it can be installed with conda;
    2) Consider building it from source in your home directory. If you run into problems, please send the reproducible steps to the HPC mailbox for help;
    3) Software can also be installed using containers;
    4) We also evaluate commonly used open-source software for global deployment. Feel free to contact us via email.
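The first check in the answer above, whether a tool is already available, can be sketched with Python's standard `shutil.which`. This only looks at the current `PATH`, so software provided through cluster modules may not be found until the module is loaded. The function names and the messages are illustrative assumptions, not part of any cluster tooling.

```python
import shutil
from typing import Optional


def installed_path(executable: str) -> Optional[str]:
    """Return the full path of an executable on PATH, or None if absent."""
    return shutil.which(executable)


def next_step(executable: str) -> str:
    """Suggest the next installation step, following the order above."""
    path = installed_path(executable)
    if path:
        return f"{executable} is already available at {path}"
    if installed_path("conda"):
        return f"{executable} not found; try installing it with conda"
    return (f"{executable} not found; consider a source build in your "
            "home directory or a container")


# Example with a deliberately non-existent tool name.
print(next_step("no-such-tool-xyz-123"))
```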
  6. Does the cluster offer commercial software?
    Not at the moment.
  7. How can a regular user use `sudo` to install software?
    Unlike a dedicated personal computer or workstation, a high-performance computing cluster shares its hardware and software resources among all users. Because `sudo` can affect other users' programs and data, ordinary users are not allowed to use it. Ordinary users can usually install and use software within their home directories without `sudo`. Software installed with `sudo` is also likely to end up on the local file system, where it cannot run on compute nodes. Please use the software modules the cluster already provides, or tell us via the HPC email which software you need. Ordinary users can also install software in containers, because users within containers have "simulated root privileges."
  8. How to acknowledge the IKKEM Intelligent Computing Center in a paper?
    The acknowledgment template is as follows. We also welcome you to share any high-quality results with us via email.
    (Chinese)本论文的计算结果得到了嘉庚创新实验室智算中心的支持和帮助;
    (English) The calculations in this paper were supported by the IKKEM Intelligent Computing Center.
  9. Is there a policy for computing time rewards?
    Not at the moment.
  10. What should I do if I forget my password or lose my key?
    Send a password reset request to ikkemhpc@xmu.edu.cn from the contact email address you provided during application, or ask the administrator in the WeChat service group for a reset.
  11. What should I do if I encounter other issues during use?
    Please refer to the User Manual or contact us via email.