Abacus Semiconductor Corporation is a fabless semiconductor company building next-generation solutions, based on its Kloth Architecture, that overcome the inherent limitations of the von Neumann and Harvard computing architectures found in all of today’s incumbent computing systems.

The resulting advantage of this novel computing architecture is at least an order-of-magnitude improvement in power consumption (and thus heat generation), in performance, or in both.

Our patent-protected Heterogeneous Accelerated Compute™ technology removes the fundamental bottlenecks affecting inter-processor communication and memory access in existing systems, enabling server manufacturers to build supercomputers for AI and beyond with unprecedented scalability. In addition, Abacus Semi’s unique architecture provides intrinsic cybersecurity features that make these vast and widely distributed servers impervious to even the most advanced cyberthreats.

There is also broad agreement across the industry that a Hybrid Cloud approach has become essential: new investment in on-premises AI data centers, and in high-performance computing and AI proxying at the (carrier) edge, is needed to power proprietary AI foundation models trained on private datasets for secure internal operations and product development.

Industry consensus is that we have reached the end of the usefulness of PCIe and of legacy memory interfaces such as DDR5, even though High Bandwidth Memory (HBM) stacks are now integrated into processors and GPGPUs. Several attempts have been made to delay the inevitable. NVIDIA introduced NVLink to connect its GPGPUs directly, circumventing the GPGPU-to-GPGPU communication bottleneck. AMD, Intel, and others countered with UALink, which aims to achieve the same. CXL attempts to address the memory-capacity and shared-access bottlenecks, so far without success. None of these technologies solves the fundamental underlying problem in connectivity. Abacus Semi is solving these challenges with the patent-pending Kloth Architecture for processors and accelerators.

Server-on-a-Chip

Abacus Semiconductor has developed a Server-on-a-Chip that makes building a server cheaper, allows for higher integration, and follows the same principles that have proven successful before, namely in smartphones and tablets (and in the mainframes of the past). We believe that hardware offload engines with proper firmware support provide better compute energy efficiency than a homogeneous cluster of host CPUs. This processor can be used for web services, file services, and high-transaction applications, as well as in traditional (i.e., disk-based) and in-memory database applications. We have added virtualization hardware to the CPU cores and the IOMMU so that they can be used in fully virtualized environments. It can be deployed in any LAMP environment without recompiling code, and as a file server or the core of a storage appliance. The Server-on-a-Chip supports both DDR5 DIMMs and our HRAM.
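
As a purely hypothetical sketch (the engine names and the soc_offload() call are our illustrative assumptions, not a published Abacus Semi API), the offload-engine model looks roughly like this from the software side: the host CPU hands a descriptor to a fixed-function engine and continues with other work, rather than spending host cycles on the operation itself.

```c
#include <stdint.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical fixed-function engines of a Server-on-a-Chip. */
typedef enum { ENGINE_CRYPTO, ENGINE_COMPRESS, ENGINE_CHECKSUM } soc_engine_t;

typedef struct {
    const void *src;
    void       *dst;
    size_t      len;
    void      (*on_done)(void *ctx);  /* completion callback */
    void       *ctx;
} soc_job_t;

/* Stub: a real implementation would enqueue the descriptor on the
 * engine's hardware queue and return immediately, with the engine's
 * firmware signaling completion. Here the "work" is faked
 * synchronously so the sketch is self-contained. */
static int soc_offload(soc_engine_t engine, soc_job_t *job)
{
    (void)engine;
    memcpy(job->dst, job->src, job->len); /* stand-in for the real operation */
    if (job->on_done) job->on_done(job->ctx);
    return 0;                             /* 0 = job accepted */
}

static void done(void *ctx) { *(int *)ctx = 1; }

int main(void)
{
    uint8_t in[256] = {0}, out[256];
    int finished = 0;
    soc_job_t job = { in, out, sizeof in, done, &finished };

    soc_offload(ENGINE_COMPRESS, &job);
    /* The host CPU would do useful work here while the engine runs. */
    return finished ? 0 : 1;
}
```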

HRAM

Today's processors and GPGPUs rely on the same outdated concept that we have seen for the past 30 years: the memory subsystem has no built-in smarts. As a result, the performance, security, and shareability of the memory subsystem are very limited. The CPU's or GPGPU's DRAM controller has to manage all memory transactions, and as such, these memory controllers are very specific to the type of memory they were developed for. We believe that this is a mistake. Memory is just a resource, and it should be managed by the memory subsystem, not by the CPU or GPGPU. We call our solution Heterogeneous Random Access Memory, or HRAM. It consists of multiple types and hierarchies of memory, and the HRAM, as an intelligent multi-homed memory subsystem, contains all the controllers needed. There is no need for any kind of memory controller in the processor or accelerator.
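
Purely as an illustration of the concept, here is a minimal C sketch of what a processor-side interface to an intelligent memory subsystem could look like; all names (hram_request_t, hram_submit) and fields are our hypothetical assumptions, not Abacus Semi's actual interface. The key idea is that the processor describes what it needs, while the HRAM decides which memory type, tier, and controller services the request.

```c
#include <stdint.h>
#include <stddef.h>

/* Hypothetical request descriptor: the processor states WHAT it needs;
 * the HRAM subsystem decides WHICH memory type/tier services it and
 * drives the matching controller internally. */
typedef enum { HRAM_READ, HRAM_WRITE } hram_op_t;

typedef struct {
    hram_op_t op;        /* read or write                             */
    uint64_t  addr;      /* address in a flat, device-managed space   */
    void     *buf;       /* source/destination buffer                 */
    size_t    len;       /* transfer length in bytes                  */
    uint8_t   shared;    /* hint: may be accessed by other processors */
    uint8_t   latency;   /* hint: 0 = latency-critical ... 255 = bulk */
} hram_request_t;

/* In a real system this would be a doorbell/queue write to the memory
 * subsystem; here it is only a stub to keep the sketch self-contained. */
static int hram_submit(const hram_request_t *req)
{
    (void)req;           /* tier selection, scheduling, ECC, etc. would
                            happen inside the HRAM, not in the CPU     */
    return 0;            /* 0 = accepted */
}

int main(void)
{
    uint8_t buffer[4096];
    hram_request_t req = {
        .op = HRAM_READ, .addr = 0x100000, .buf = buffer,
        .len = sizeof buffer, .shared = 1, .latency = 0,
    };
    return hram_submit(&req);
}
```

Note that nothing in this interface names a memory technology: that is the design choice the HRAM concept argues for, since the same request could be serviced by DRAM, stacked memory, or persistent memory without the processor changing.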

Application Processor

This processor family is targeted at general-purpose processing that requires cache-coherent, Terabyte-scale main memory. Its main purpose is the orchestration of tasks that are distributed to other processors or to specific accelerators in large-scale applications. It is intended to serve as the main processor that controls application program flow and distributes tasks to other processors or accelerators in HPC and Big Data as well as in Artificial Intelligence (AI) and Machine Learning (ML) workloads. For the time being, its hardware will be identical to the Database Processor's, but it will ship with different firmware.
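
To make the orchestration role concrete, here is a minimal hypothetical C sketch (all names, such as dispatch_to(), are our illustrative assumptions, not Abacus Semi's actual software stack): the Application Processor runs the control flow and hands each stage of a job to whichever processor or accelerator fits it best.

```c
#include <stdio.h>

/* Hypothetical targets in an Abacus-style heterogeneous system. */
typedef enum { TARGET_DB_PROCESSOR, TARGET_MATH_PROCESSOR, TARGET_LOCAL } target_t;

typedef struct {
    const char *name;
    target_t    target;  /* which unit should run this stage */
} stage_t;

/* Stub dispatcher: a real system would place the stage on the target's
 * work queue over the on-chip or inter-chip fabric. */
static void dispatch_to(target_t target, const stage_t *s)
{
    static const char *names[] = { "database processor", "math processor", "local core" };
    printf("stage '%s' -> %s\n", s->name, names[target]);
}

int main(void)
{
    /* The Application Processor owns the program flow ... */
    stage_t pipeline[] = {
        { "scan records",        TARGET_DB_PROCESSOR   },
        { "solve linear system", TARGET_MATH_PROCESSOR },
        { "aggregate results",   TARGET_LOCAL          },
    };

    /* ... and only coordinates: the heavy lifting happens elsewhere. */
    for (size_t i = 0; i < sizeof pipeline / sizeof pipeline[0]; ++i)
        dispatch_to(pipeline[i].target, &pipeline[i]);

    return 0;
}
```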

Database Processor

This processor is targeted at ultra-high-frequency transaction processing, large-scale database applications, web services, and all other integer-only applications that require cache-coherent, Terabyte-scale main memory. Large-scale in-memory databases such as ScyllaDB benefit from the internal and external bandwidth of this processor, the multi-homing of the memory, and the scalability of the solution. It is a processor and an accelerator in one; for the time being, its hardware will be identical to the Application Processor's, but it will ship with different firmware.

Math Processor

We believe that the current math coprocessor concept does not work as effectively and efficiently as it could. While GPGPUs have helped tremendously, there are inherent limitations in GPGPU-accelerated compute. Among other things, GPGPUs are effectively SIMD engines, and while there are a few thousand of them on a single die, they are not optimized for many of the mathematical operations that reflect the physical problems HPC users need to solve. Most GPGPU compute today is predicated on CUDA, which is a captive and proprietary solution. We prefer open-source programming frameworks such as OpenCL and OpenACC. In our experience, open-source frameworks lead to better code quality, greater breadth and depth of solutions, and ultimately better adoption. We have made life easier for programmers and users with built-in vector, matrix, and tensor math functions, as well as built-in support for transforms, such as the Fourier transform.
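
As an example of the open, standards-based programming model we favor, the following minimal C sketch offloads a vector addition with a single OpenACC directive. The pragma itself is standard OpenACC; whether Abacus Semi's toolchain maps this exact directive onto the Math Processor's built-in vector units is our assumption for illustration.

```c
#include <stdio.h>

#define N 1024

int main(void)
{
    float a[N], b[N], c[N];

    for (int i = 0; i < N; ++i) { a[i] = (float)i; b[i] = 2.0f * i; }

    /* With an OpenACC-capable compiler (e.g. `nvc -acc` or
     * `gcc -fopenacc`), this loop is offloaded to whatever accelerator
     * the runtime targets; without one, the pragma is ignored and the
     * loop simply runs on the host CPU. */
    #pragma acc parallel loop copyin(a, b) copyout(c)
    for (int i = 0; i < N; ++i)
        c[i] = a[i] + b[i];

    printf("c[100] = %f\n", c[100]); /* expect 300.0 */
    return 0;
}
```

Because the directive describes parallelism rather than a vendor's hardware, the same source can be retargeted across accelerators, which is precisely the portability argument for open frameworks over a captive solution such as CUDA.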