Research Article
Corresponding author: B. M. Shilpa ( shilpabm244@gmail.com ) © 2024 B. M. Shilpa, Sathisha K. Shet.
This is an open access article distributed under the terms of the Creative Commons Attribution License (CC BY 4.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
Citation:
Shilpa BM, Shet SK (2024) Optimized register level transfer mechanism for ASIC-based memscomputing. Modern Electronic Materials 10(1): 19-28. https://doi.org/10.3897/j.moem.10.1.113631
Micro-electromechanical system (MEMS)-based semiconductor sensor systems typically need a reliable on-chip analog readout circuit to identify and process changes in physical properties. To improve the mobility, robustness, reliability and performance (in terms of power consumption and speed) of the sensor system, the remaining parts of the post-processing logic, such as the digital data processor (or part of it), can also be integrated into the chip. Cost and integration are always important considerations in both analog and digital designs, but they become even more important when the data-processing subsystem must be squeezed into a very small space near the MEMS sensor structure. This paper introduces a new Buffalo-based register-transfer level (BRTL) method that aims to improve the efficiency and reliability of the digital data processing system for the MEMS sensor post-processing algorithm. Initially, the memcomputing system was designed with the needed functional processing elements. Then, a novel Buffalo-based register-transfer level design was modelled in the MEMS architecture. The digital data transmission process was performed to estimate the reliability of the proposed BRTL model. Finally, the performance of the proposed model was validated.
application specific integrated circuit (ASIC), Buffalo-based register transfer level, memcomputing, functional verification
Memcomputing is one of the technologies that provides noteworthy improvements in solving hard optimization problems compared with conventional algorithmic techniques [
The 3-SAT formula was built by the seminal work of Bearden [
Also, ASIC designs would diminish power consumption and make it suitable for mobile and embedded systems. In the hardware design, the implementation of data plane functionality in an ASIC [
The key contributions of the present study are as follows:
– initially, the memcomputing system was designed with the needed functional processing elements;
– then, a novel Buffalo-based register-transfer level design (BRTLD) was modelled in the MEMS architecture;
– the digital data transmission process was performed to estimate the reliability of the proposed BRTL model;
– finally, the chief metrics such as delay, memory usage and power/energy utilization were measured and compared with other models.
The remainder of the paper is organized as follows. Related work is reviewed in section two. The problems faced by the conventional method are presented in section three. The solution to the defined problem is elaborated in section four. The validated results for the novel solution are discussed in section five. Section six concludes the paper.
The memristive neuromorphic system is implemented by integrating resistive switching memories on CMOS hardware. The term neuromorphic has been used to describe analog, digital and mixed analog/digital designs. Such a network must construct neurons and artificial synapses that are able to imitate the complexity of their biological counterparts. Since this network was inherently volatile, binary and poorly scalable, devices based on novel physical principles were necessary to replicate biological synapses and neurons in a way consistent with high-density connectivity and complex processing [
One of the major problems was that no implementation using memristors in circuits was available in formal technologies. To overcome this issue, a Memristor Cellular Non-linear Network (M-CNN) was used to implement the digital realization of a CNN with memristors. In this network, the process could balance time and accuracy as a trade-off. Applications of this network include data compression, Fast Fourier Transform (FFT) computation, image processing algorithms, chaotic equation estimation and analog-to-digital conversion. In addition, the energy consumption, circuit complexity and area overhead for a minimum number of bits were decreased. However, the time required to perform an operation increased with the number of bits [
The primary network unit, an oscillator circuit-based memristor, is illustrated first, and the network is formed by coupling the oscillators. The network was then modified to compensate for the unbalanced connections from coupled node to coupled node and to account for device-to-device variability, and the original array was extended. To implement the reconfigurable network, transistors were inserted in series with the coupling capacitors, which allowed the connections between the cells to be switched on and off. For example, the cells can be organised in a matrix configuration, with the respective control signals selected through multiplexers by an ad-hoc address code. Reprogramming the connections among the oscillators by means of electrical signals allowed various computing phenomena to be implemented [
Ant colony optimization (ACO) was designed on a memristor crossbar array. Initially, a non-linear voltage-controlled memristor mechanism with a relaxation term was introduced, and the optimization process was then performed with a padding strategy. Furthermore, the memristor crossbar array was designed with external control circuits implementing the ACO strategy, which provided highly parallel computing and high device density. The threshold for generating the edges was selected directly as the mean of the final conductance matrix when designing the memcomputing system on the memristor crossbar array [
In-memory computing eliminates the movement of data between the physically separate computing and memory units of traditional computers. A quadratic eigenvalue problem arises from the transfer function of the circuit, where the minimum eigenvalue corresponds to the dominant pole that governs the circuit's response time. The size of the problem also affects the circuit response. The computational speed of the circuit was enhanced through parameter setting, with the parameters operated synergistically to speed up the computation. These parameters were also used to determine the computational error and the power consumed in the circuit. Such a circuit is crucial for developing in-memory machine learning (ML) accelerator applications for memcomputing design [
Memcomputing machines operate in continuous time, whereas simulating them on modern computers requires time to be discretized. Physical time is continuous, and discrete time is not a physical quantity. The universal memcomputing machine (UMM) is Turing-complete, that is, it can simulate a universal Turing machine (UTM). UMMs include both analog and digital machines. Analog memcomputing machines have remarkable computational power, but they do not scale well, because larger machines require growing resources to obtain the same accuracy. The digital memcomputing machine (DMM) was scalable and focused on the dissertation process. Moreover, if the functional operations are not optimised, high power consumption and area usage are recorded.
The difficulties of the traditional method are described in Fig.
The first step in the ASIC design is to turn the idea into a chip specification. The specification covers goals, functionality, speed and power, together with technology constraints such as size and shape. The next step of the design is the structural and functional description, whose functionality should match the specification. Here, a novel Buffalo-based register-transfer level design (BRTLD) is proposed for controlling the functional features of the ASIC design. The Buffalo best solution helps to keep the MEMS architecture in optimal condition by controlling the data overflow. The proposed architecture is defined in Fig.
The proposed methodology develops a novel buffalo-based register-transfer level design method aimed at designing an optimized memcomputing system for digital applications related to the MEMS sensor's post-processing algorithm. Here, buffalo optimization provides the best solution in the proposed method.
To achieve an effective combination of BRTL and the rapid design steps, the novel description language known as AMDL (Algorithmic Microarchitecture Description Language) was developed along with the new pre-synthesis model. AMDL is the register transfer level (RTL) language that is tailored for describing data paths that include algorithmic constructs, which improves the design time and readability. The pre-synthesis algorithm transforms the AMDL model into the structural RTL model represented in the VHDL, which can be synthesized into the higher quality gate level description using actual RTL tools, as defined in Fig.
A well-designed language-based approach can provide a high-level design method and a general-purpose algorithmic language to increase the efficiency of hand-optimized BRTL designs. To ensure that high-level optimization is possible beyond the nature of the algorithm, it is important for the designer to have full visibility of the architecture. AMDL tasks contain right-value and left-value expressions that refer to the specific BRTL functional units and data storage elements, which are listed in the resource declaration section of the description.
$$D(x) = x_1, x_2, x_3, \ldots, x_n \tag{1}$$
The digital data for the design are initialized using Eq. (1), where D (x) denotes the objective function variable. The data are initialized as binary values. The AMDL model does not include the exact functionality of all operators. AMDL designers understand the desired behaviour of the design and create an AMDL description that captures this behaviour accordingly. In most cases, an operator is simply a library unit, but if its functionality is more complex, VHDL code can be used to describe it during the pre-synthesis phase.
Because the designer manually executes scheduling, binding and resource allocation, AMDL is not a programming language in itself but rather a way of writing code that follows the algorithmic principles of the BRTL model and describes a higher abstraction level within BRTL. The conversion of AMDL to BRTL is straightforward during the pre-synthesis stage because of the detailed nature of the AMDL language. Functional verification is important to a design's success. For a given design, there are often millions or billions of potential test cases. Functional verification can be carried out using Eq. (2).
$$F = D(x) + T_1(T_s - F_s) + T_2(T_s - F_s) \tag{2}$$
Here, F is the functional verification variable, D (x) is the objective function variable, Ts is the true specification, Fs is the false specification, and T1 and T2 are test one and test two. This equation is used to verify that the specific model implements the specification correctly. The datapath is created from the task statements in the AMDL model. The AMDL designer creates the BRTL netlist by explicitly stating the connections between the logic elements and storage. The pre-synthesis algorithm identifies and resolves situations where multiple resources compete for the same input by inserting multiplexers into the netlist as needed. After the BRTL netlist has been generated, the model generator algorithm creates HDL models of the datapath resources and combines them into a single VHDL entity and architecture. If the AMDL model includes a design-specific operator, the datapath should be complemented by the VHDL model of that operator as well.
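The multiplexer-insertion step described above can be illustrated with a minimal Python sketch. It is not the authors' pre-synthesis algorithm: the connection format (pairs of source and destination input), the function name `insert_multiplexers` and the `mux_<input>` naming scheme are all assumptions made for illustration.

```python
from collections import defaultdict


def insert_multiplexers(connections):
    """Resolve inputs that have several drivers by inserting a multiplexer.

    connections: list of (source, destination_input) pairs (hypothetical format).
    Returns (netlist, muxes): the rewritten connection list and a mapping from
    each inserted multiplexer to the sources it selects between.
    """
    # Collect every distinct driver of each destination input, in first-seen order.
    drivers = defaultdict(list)
    for source, dest in connections:
        if source not in drivers[dest]:
            drivers[dest].append(source)

    netlist = []
    muxes = {}
    for dest, sources in drivers.items():
        if len(sources) == 1:
            # A single driver needs no arbitration: keep the direct wire.
            netlist.append((sources[0], dest))
        else:
            # Several drivers compete for one input: route them through a mux.
            mux = f"mux_{dest}"
            muxes[mux] = list(sources)
            for index, source in enumerate(sources):
                netlist.append((source, f"{mux}.in{index}"))
            netlist.append((f"{mux}.out", dest))
    return netlist, muxes


# Illustrative datapath: two registers both feed the adder's x input.
connections = [
    ("reg_a", "adder.x"),
    ("reg_b", "adder.y"),
    ("reg_c", "adder.x"),
]
netlist, muxes = insert_multiplexers(connections)
```

In this example only `adder.x` has more than one driver, so a single multiplexer is inserted and the connection to `adder.y` is left untouched. The select signals of the inserted multiplexers would be driven by the control unit, which this sketch does not model.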
Control unit generation is based on specific mapping rules that determine the control states needed, after which the distinct AMDL control structures are implemented. AMDL provides additional control structures that improve the language's expression of concurrency beyond the fundamental elements of structured programming such as decisions, loops, statements and sequences. The logic equivalence check can be carried out using Eq. (3).
(3)
Here, Eq is the logic equivalence checking variable, T1 is true specification one, T2 is true specification two and r is a random value. The logic equivalence check is a crucial step to perform after synthesis. It checks the equivalence between the original code and the synthesized netlist. The power is calculated using Eq. (4).
$$\text{Power} = P_s + P_C + P_l \tag{4}$$
$$\text{Power} \le 0.5 \;\Rightarrow\; \text{optimal} \tag{5}$$
Here, Ps is the switching power, PC is the short-circuit power and Pl is the leakage power. Eq. (5) is used to check the power optimization: a power value of at most 0.5 is considered optimal. Otherwise, the flow moves to the gate-level netlist and removes the data path using the buffalo optimization to optimize the power.
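As a worked illustration of Eqs. (4) and (5), the following Python sketch sums the three power components and applies the 0.5 threshold. The function names and the numeric inputs are hypothetical, and the source does not state the unit of the threshold, so the values are treated here as plain numbers.

```python
POWER_LIMIT = 0.5  # threshold from Eq. (5); its unit is not stated in the source


def total_power(p_switching, p_short_circuit, p_leakage):
    """Eq. (4): total power as the sum of switching, short-circuit and leakage power."""
    return p_switching + p_short_circuit + p_leakage


def power_is_optimal(power, limit=POWER_LIMIT):
    """Eq. (5): the design counts as power-optimal when power does not exceed the limit."""
    return power <= limit


# Illustrative component values only; they are not measurements from the paper.
power = total_power(0.25, 0.125, 0.0625)
needs_gate_level_pass = not power_is_optimal(power)
```

With these illustrative inputs the total is 0.4375, which is below the limit, so no gate-level power optimization pass would be triggered.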
$$\text{Area} \le 3.2 \;\Rightarrow\; \text{optimal} \tag{6}$$
Eq. (6) is used to check the area optimization, where 3.2 is the standard area usage value in BRTL. An area value of at most 3.2 is considered optimal. Otherwise, the flow moves to the gate level, reduces the gate size and minimizes the logic using the buffalo optimization to optimize the area. Control unit generation consists of two main steps. In the first, the control state allocation process of the pre-synthesis stage maps AMDL control structures to their VHDL implementation, and the hierarchical structure of FSM fragments is built. The second step reduces the number of clock cycles by finding and using any opportunities where multiple control states can be executed at the same time. The generator of the BRTL model then creates the entire FSM implementation from the optimized tree of control states.
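The second step above, executing several control states in the same clock cycle where possible, can be sketched as a greedy packing pass. This is only an illustration under stated assumptions: the `ControlState` record, the conflict test and the greedy strategy are hypothetical and are not taken from the paper, which does not specify how such opportunities are detected.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ControlState:
    name: str
    units: frozenset   # functional units used in this state
    reads: frozenset   # registers read in this state
    writes: frozenset  # registers written in this state


def can_share_cycle(a, b):
    """Conservative test: True if states a and b may run in the same clock cycle."""
    if a.units & b.units:
        return False  # both need the same functional unit
    if a.writes & (b.reads | b.writes):
        return False  # b depends on, or overwrites, a result of a
    if b.writes & a.reads:
        return False  # b would overwrite an operand of a
    return True


def pack_states(states):
    """Greedily merge each state into the current cycle when no conflict exists."""
    cycles = []
    for state in states:
        if cycles and all(can_share_cycle(prev, state) for prev in cycles[-1]):
            cycles[-1].append(state)
        else:
            cycles.append([state])
    return cycles


# Illustrative sequence: r1 = a + b; r2 = c * d; r3 = r1 + r2
states = [
    ControlState("s0", frozenset({"adder"}), frozenset({"a", "b"}), frozenset({"r1"})),
    ControlState("s1", frozenset({"multiplier"}), frozenset({"c", "d"}), frozenset({"r2"})),
    ControlState("s2", frozenset({"adder"}), frozenset({"r1", "r2"}), frozenset({"r3"})),
]
cycles = pack_states(states)
schedule = [[state.name for state in cycle] for cycle in cycles]
```

Here `s0` and `s1` use different units and touch different registers, so they share one cycle, while `s2` needs the adder again and the results of both earlier states, so it starts a new cycle. The three control states therefore take two clock cycles instead of three.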
Algorithm 1: BRTL design
Start
{
    Initialization()
    {
        int D (x) = 1, 2, …, n;
        // initialize the digital data
    }
    Functional verification()
    {
        int F, Ts, Fs;
        // initialize the function verification variables
        Verify → D (x) – false specification
        // verify the correct specification
        if (specification = true)
        {
            Go to next step
        } else go to previous step
    }
    Logic equivalence check()
    {
        Power optimization()
        {
            Power → Ps + PC + Pl
            // calculate the power
            if (power ≤ 0.5)
            {
                Optimal
            } else go to gate level netlist
            // optimization of power
        }
        Area optimization()
        {
            if (area ≤ 3.2%)
            {
                Optimal
            } else go to gate level
            // area optimization
        }
    }
}
Stop
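The control flow of Algorithm 1 can be traced with a short Python sketch. It is a simplified reading of the pseudocode, not the authors' Matlab implementation: the function `brtl_flow`, the `spec_ok` predicate and all input values are hypothetical, the buffalo optimization itself is not modelled, and the logic equivalence check is omitted because its equation is not recoverable from the text.

```python
def brtl_flow(data, spec_ok, power_terms, area, power_limit=0.5, area_limit=3.2):
    """Walk through the checks of Algorithm 1 and report the outcome of each."""
    # Initialization: the digital (binary) input data.
    report = {"initialized": list(data)}

    # Functional verification: on failure the algorithm goes back a step.
    report["functional_verification"] = bool(spec_ok(data))
    if not report["functional_verification"]:
        report["next_step"] = "return to previous step"
        return report

    # Power optimization check: total power against the 0.5 threshold.
    power = sum(power_terms)
    report["power"] = power
    report["power_optimal"] = power <= power_limit

    # Area optimization check: area against the 3.2 threshold.
    report["area_optimal"] = area <= area_limit

    if report["power_optimal"] and report["area_optimal"]:
        report["next_step"] = "done"
    else:
        report["next_step"] = "gate-level optimization"
    return report


def spec_ok(data):
    """Hypothetical specification: every sample must be a binary value."""
    return all(bit in (0, 1) for bit in data)


# Illustrative inputs only; none of these values come from the paper.
report = brtl_flow([1, 0, 1, 1], spec_ok, (0.25, 0.125, 0.0625), area=2.5)
```

With these inputs the data pass verification and both thresholds are met, so the flow finishes without a gate-level pass. Raising the power terms or the area above their limits changes `next_step` to the gate-level optimization branch.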
The flow and step-by-step process of the novel buffalo-based register level transfer design for ASIC-based memcomputing can be described in Fig.
The novel BRTL design is validated in Matlab on the Windows 10 platform. The buffalo-based register-transfer level design was implemented to control the functional features of the ASIC design. The Buffalo best solution helps to keep the MEMS architecture in an optimal condition by controlling the data overflow. First, the binary data are initialized.
The performance of the proposed model can be validated by power consumption, area, delay and memory usage. The specification of the execution parameters is given in Table
Descriptions of the parameters
Parameter | Value
Programming environment | Matlab
Dataset | Digital data
Operating system | Windows 10
Optimization | Buffalo optimization
Memcomputing is a computing paradigm that uses memory as a computational resource, and the BRTL is implemented in a new ASIC-based memcomputing system. A memristor is a type of memory that can both compute and store information, which makes memristors ideal for memcomputing applications. The combination of ASIC and memristor was successfully implemented in the memcomputing application using the BRTL design.
The memcomputing system supports a set of arithmetic operations. The diagram shows how to perform arithmetic operations using a series of algorithms. The algorithms are stored in the memory of the memcomputing device, and they are executed one after the other to perform the desired operations. Figure
The memcomputing device has components like a register, algorithm and control unit. In memcomputing, the algorithms are stored in the memory of the device and are executed one after the other. The advantage of using it to perform arithmetic operations is that it is much faster than traditional methods. This is because the algorithms for the desired operations can be stored in memory and executed very quickly. Additionally, memcomputing devices are very energy efficient, so they can be used to perform arithmetic operations on battery-powered devices.
Figure
The presented model is designed using the BRTL implemented in Matlab and runs on the Windows 10 platform. Metrics such as area, power consumption, memory usage and delay are computed to validate the performance of the proposed method. To validate the performance improvement, recent related models are used for comparison. The existing models such as Spiking Neural Network (SNN) [
The power consumption of an electronic device can be calculated as the power supplied to the device minus the power lost by the device. Power consumption is the rate at which the circuit uses energy while it is operating. It is an essential factor in the design of electronic systems because it can have a significant impact on the overall performance of the system and the life of the battery.
The power consumption can be calculated by using the Eq. (4).
Among the existing models, SNN consumed 46 µW, HOERAA consumed 28.54 µW, BNN consumed 59.75 µW and RNS-MLDA consumed 44.11 µW. The proposed model consumed 24.1548 µW, which is less than the existing models. The power consumption comparison with the existing models is presented in Fig.
Delay is the time that it takes for the signal to travel from one point to another in a circuit. It is an important factor in the electronic system design because it has a significant impact on the overall performance of the system.
$$\text{Delay} = D_t + D_w \tag{7}$$
The delay can be calculated using Eq. (7), where Dt is the transistor delay, that is, the time it takes for a signal to travel through the transistor, and Dw is the wire delay, that is, the time it takes for the signal to travel through a wire.
Among the existing models, the RNS-MLDA model showed a delay of 1.4 ns, the SNN model 0.98 ns, the HOERAA model 0.97 ns and the BNN model 0.97 ns. The proposed model achieved a delay of 0.812 ns. The comparison of delay with the existing models is presented in Fig.
Area is the amount of space that the circuit occupies on a chip. It is an essential factor in electronic system design because it can have a significant impact on the system cost.
Among the existing models, the HOERAA method used an area of 418.58 µm², the SNN method 128 µm², the RNS-MLDA method 54.32 µm² and the BNN method 23.18 µm². The proposed model used an area of 20.95 µm², which is lower than the existing methods. The comparison of the area with the existing methods is presented in Fig.
Memory is a circuit that can store data permanently or temporarily. It is an important component in electronic system design because it can store data for various purposes.
(8)
The memory usage can be calculated using Eq. (8), where Ds is the size of the data, Na is the number of times the memory will be accessed and Ta is the time taken for accessing the memory.
Among the existing models, SNN used 533 MB of memory. The proposed model used 259 MB, which is less than the existing method. The comparison of memory usage with the existing method is presented in Figure
Overall, the presented method obtained better scores across the metrics: deviation time, area, power, frequency and cells. The overall performance of the proposed BRTL model has been tabulated in Table
The overall performance of the proposed methodology is described in Table
Comparison statistics
Methods | Power (µW) | Delay (ns) | Area (µm²) | Memory (MB)
SNN | 46 | 0.98 | 128 | 583
HOERAA | 28.54 | 0.97 | 418.58 | –
BNN | 59.75 | 0.97 | 23.18 | –
RNS-MLDA | 44.11 | 1.4 | 54.32 | –
Proposed | 24.1548 | 0.812 | 20.95 | 259
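The relative improvements implied by the comparison table can be worked out with a few lines of Python. This is an illustrative calculation on the tabulated values only; the helper `reduction_percent` and the dictionary layout are assumptions, and memory is left out because the table reports it for a single baseline.

```python
# Values copied from the comparison table (power in µW, delay in ns, area in µm²).
baselines = {
    "SNN": {"power": 46.0, "delay": 0.98, "area": 128.0},
    "HOERAA": {"power": 28.54, "delay": 0.97, "area": 418.58},
    "BNN": {"power": 59.75, "delay": 0.97, "area": 23.18},
    "RNS-MLDA": {"power": 44.11, "delay": 1.4, "area": 54.32},
}
proposed = {"power": 24.1548, "delay": 0.812, "area": 20.95}


def reduction_percent(baseline, value):
    """Relative reduction of `value` with respect to `baseline`, in percent."""
    return 100.0 * (baseline - value) / baseline


# Reduction achieved by the proposed model against every baseline and metric.
reductions = {
    name: {metric: reduction_percent(metrics[metric], proposed[metric])
           for metric in proposed}
    for name, metrics in baselines.items()
}

for name, row in reductions.items():
    summary = ", ".join(f"{metric} {value:.1f}%" for metric, value in row.items())
    print(f"{name}: {summary}")
```

For example, against HOERAA, the baseline with the lowest power, the tabulated values correspond to a power reduction of about 15.4%, and against the 0.97 ns baselines to a delay reduction of about 16.3%.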
Performance of BRTL
Parameters | Performance
Lines of code | 100
Delay (ns) | 0.812
Area (µm²) | 20.95
Cells | 3983
Deviation time (s) | 1.21
Power (µW) | 24.1548
Frequency (MHz) | 28.2
Memory (MB) | 259
High-quality RTL models for digital data processing in MEMS sensors, which lead to optimal gate-level implementations, are typically hard to develop and keep up to date. A BRTL design with optimization techniques that are effective in reducing area and lowering power consumption can significantly reduce the design effort and time. The paper proposes a new way to model and generate BRTL hardware using the AMDL language. AMDL provides a high-level algorithmic interface for BRTL designers, which can lead to faster and more efficient designs. The proposed pre-synthesis steps for translating the AMDL model to BRTL have been shown to produce higher-quality output, which leads to efficient subsequent synthesis steps. Initially, the memcomputing system was designed with the needed functional processing elements. Then, a novel Buffalo-based register-transfer level design (BRTLD) was modelled in the MEMS architecture. The digital data transmission process was performed to estimate the reliability of the proposed BRTL model. The proposed model consumed 24.1548 µW of power, an improvement of 2% over the traditional model. Finally, the chief metrics such as delay, memory usage and power/energy utilization were measured and compared with other models. In future work, a high-speed and area-efficient model will be designed on the basis of the proposed model.