Using Machine Learning In EDA

This approach can make designs better and less expensive, but it will require a huge amount of work and more sharing of data.


Machine learning is beginning to have an impact on the EDA tools business, cutting the cost of designs by allowing tools to suggest solutions to common problems that would take design teams weeks or even months to work through.

Beyond lowering design costs, it potentially expands the market for EDA tools, opening the door to new design starts and more chips from more companies.

“There’s a huge elasticity of demand, and almost any innovation that is seen as making the tool run faster or taking less resource to run the tool almost always immediately turns into somebody seeing how they can run more in parallel or take on bigger tasks, and do bigger chips,” said Chris Rowen, CEO of Cognite Ventures.

Efficiency increases usually translate into engineering teams being able to take on other work, so the work itself diversifies, observed Ting Ku, senior director of engineering at Nvidia. “When we couple up, we don’t do the same things. We actually do something else that we didn’t have time to do before, as in, ‘Now I have free time because efficiencies improve so I get to do this now.’”

This should be good news for EDA tool providers. “We used to do five jobs, now I’m doing 10 distinct jobs and I need 10 tools, so where is the growth for EDA? I have to find more things,” said Ku. “You need test, and you need EDA’s help to do it. That is the growth area.”

It might seem as if solving problems would reduce the number of tools being sold. Exactly the opposite is happening.

“An example of this phenomenon in the test space was when we figured out how to speed up stuck-at-fault,” said Harry Foster, verification chief scientist at Mentor, a Siemens Business. “We realized we now had time to do timing pass or something else that we didn’t do before because we didn’t have time.”

Just getting started

While the promise is huge, the augmentation of EDA tools with machine learning is only just beginning.

“One problem we’ve been thinking on is that we indeed help customers a lot, but how are we going to charge more?” asked Norman Chang, chief technologist for the semiconductor business unit at ANSYS. “When it comes to the machine learning enhancement capabilities on the existing tools, customers probably will not want to pay extra money, and it’s very difficult to make an independent tool out of nothing. So what is the model that will work with customers, and what are customers willing to give for EDA tools? That point is not clear.”

And while it might be possible to layer machine learning enhancements on top of existing tool architectures, Andrew Kahng, professor at the University of California, San Diego, said the distinction between applying machine learning ‘in’ EDA tools and ‘around’ them is very important. The ‘around’ is where a huge amount of value can be harvested, and people will initially follow the money and the ROI there.

This path would likely see EDA and its customers adopt machine learning techniques more quickly.

Kahng pointed out that a recently published paper with Qualcomm about the company’s design center in Bangalore examined resource management and schedule optimization, e.g., “when a late RTL bug hits, do you have to buy new servers to manage your 20+ ongoing projects with acceptable schedule risk, or can you just reshuffle existing resources? The results are very striking. When potential benefits are on the order of millions of dollars or weeks of schedule, anyone can see the value. So, those built ‘around’ optimizations will come first.”

“And maybe there’s a future of EDA where instead of having the usual EDA tools ‘with’ machine learning layered on top, we will see new generations of EDA tools architected ‘for’ machine learning that are inherently more predictable. That would be a new kind of EDA. It’s actually core EDA — routers, optimizers, chip planners, and so on — but with greater stability and less chaos, for example,” he suggested.

To Rowen, this is still about improving more or less ‘classical’ flows using machine learning. He said the ‘around’ approach utilizes heterogeneous integration of multiple tools. “You’re partly addressing the data problem, because today if you’re getting analog simulation and verification coverage, and you’re getting place and route congestion data, there isn’t any holistic way of looking at all of that. But some of these machine-learning techniques, especially deep learning, are extremely effective at integrating across multiple data types and figuring out what to consider from each of the types of input.”

And as long as it can be related to some outcome that is predetermined, then all of those kinds of data can be thrown into the hopper so that the system figures out what is most important in determining yield. “Was it how much timing margin I had, or how much was it physical design rules that were affecting my yield over here,” said Rowen. “It gets to wrap around because it gets to sit on top of multiple tools and pull data. Years ago we went through the whole database thing. This is now kind of a data aggregation/data exploitation, where it isn’t just whether I can read the output of some other tool. When I take all of the statistical results from all of these tools, what’s the big picture that it’s painting? Machine learning is uniquely positioned to be able to start giving some insight.”
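The idea of pooling data from several tools and asking which signal best explains an outcome such as yield can be illustrated with a deliberately small sketch. It ranks signals by plain Pearson correlation, which is a much simpler technique than the deep learning Rowen describes, and it runs on synthetic data. The column names (`timing_margin_ns`, `drc_waivers`, `route_congestion`) and the relationship between them and yield are assumptions made up for illustration, not measurements from any real flow.

```python
import random

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / (vx * vy) ** 0.5

def rank_signals(table, outcome):
    """Rank the columns of `table` by absolute correlation with `outcome`."""
    scores = {name: abs(pearson(col, outcome)) for name, col in table.items()}
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)

# Synthetic stand-in for data aggregated from several tools (hypothetical names).
rng = random.Random(0)
n = 500
table = {
    "timing_margin_ns": [rng.uniform(0.0, 0.2) for _ in range(n)],
    "drc_waivers": [float(rng.randint(0, 10)) for _ in range(n)],
    "route_congestion": [rng.uniform(0.0, 1.0) for _ in range(n)],
}
# Assumed ground truth: yield depends strongly on timing margin, weakly on
# DRC waivers, and not at all on congestion.
yield_frac = [
    0.8 + 0.5 * tm - 0.002 * drc + rng.gauss(0.0, 0.01)
    for tm, drc in zip(table["timing_margin_ns"], table["drc_waivers"])
]

ranking = rank_signals(table, yield_frac)
for name, score in ranking:
    print(f"{name}: |r| = {score:.2f}")
```

A correlation ranking only captures linear, one-signal-at-a-time effects. The appeal of the learned models discussed here is that they can also pick up interactions across data types that such a ranking would miss.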

Ultimately, any addition has to improve the time to results or quality of results, noted Sashi Obilisetty, director of R&D at Synopsys. “Without that, technology is just technology — it has to enable customers for better optimization, and maybe deep learning is the one that will bring all of these correlations together. It’s a matter of experimenting, collaborating, and then coming up with a solution that works. In EDA we work so collaboratively with our customers, it’s going to be very important to continue to do that, and not lose sight of the value that we are giving.”

Ku agrees. “A company like ANSYS or Synopsys that owns multiple different domains of tools has been working hard trying to solve concurrent engineering or solving problems simultaneously. But in reality, that fit is actually quite difficult. When you are crossing domains, it’s a really difficult problem. Machine learning techniques actually merge these cross-domain problems quite nicely, and it’s a unique trait of this strategy that hasn’t been invented yet.”

GPUs in EDA?

Another area for consideration is the possibility of running EDA tools enhanced with machine learning on GPUs. While GPUs are the platform of choice for training machine-learning systems, it’s unknown whether that will work for EDA problems.

Ku said this is his hope. “When I talk to EDA vendors, most of the time I tell them in the beginning they’d be doing smaller problems — just augmenting the existing flows. These are small problems, and they can be solved with CPUs in the traditional way because as soon as you ask people to buy a GPU farm they say, ‘No, thank you. We’re not doing that. Except Nvidia — we already have a GPU farm. You can work with us.’ But the hope is, at some point, the problem will reach a particular inflection point that it will be so difficult, so complex, and the variables so high-dimensional that you have to use GPUs to solve your problem. That day is not here yet, but I believe it will come as the growth trajectory suggests.”

Obilisetty noted that Synopsys has some efforts underway to run classic tools on GPUs, but doing so requires significant re-architecting, which is not a trivial amount of work.

“Anything can be done,” she said. “It takes work. In the same way, introducing machine-learning concepts and machine-learning techniques into our tools needs a rethinking. It’s never easy, and we don’t want to compromise the quality of results but we still want to be smart about doing less simulations, less analysis so I don’t think one is easier than the other but if we ever get to the stage using more ML, more deep learning, there’s a lot of hardware that can really help with quick turnarounds. We still have to get there.”

Nevertheless, classic tools like SPICE simulation don’t fit that well on a highly parallel, floating point architecture like a GPU. “At best you might be able to get there with restructuring,” said Rowen. “Machine learning and deep learning certainly are parallelizable, so they will run well on parallel architectures like GPUs. The two unknowns are what class of techniques are people using? Is it relatively simple statistical analysis from the bag of machine learning tricks, or is it full deep learning?”

Machine learning often is not that computationally intensive, so it’s really more about figuring out how to use the right model. “Deep learning, for sure, is computationally intensive, especially in training, but also in inference,” Rowen said. “Just applying it is fairly computationally expensive. In the first case, where it’s not very computationally intensive, the CPU is probably adequate. When you get into inference, it then becomes a question of how significant it is. The world is starting to fill up with people who are building more specialized, parallel inference engines for image processing. We might get to the point where doing that computation is so big, that you go there. That seems less likely. This seems like a problem that’s going to be, while compute-intensive, a server-based thing where staying on mainstream hardware is going to be good enough — maybe too good. But I don’t think we’re likely to find that the existing kinds of training and inference processors are inadequate to do what we do in EDA. I think we’re years away from exceeding the capacity.”

However, Kahng insists ML could be closer than most think. “Watching competitions on Kaggle, or the progress of AlexNet, SqueezeNet, etc., you see tremendous commoditization and availability of high-quality, open-source stacks. Think about the 90/10 rule. We have a canonical bag of tricks developed by the machine learning field over these past two decades, plus a lot of low-hanging IC design applications where one can get to that 90% very easily. We can predict detailed route DRCs before launching an expensive run. Or we can predict the error in slew calculations or miscorrelations between two golden analysis tools. Or missing corners in a multi-corner analysis: run your timer for 15 corners and then predict endpoint slacks at 50 other corners. These are incredibly high-value applications for ML. And once you have a model trained on, say, a particular 16nm FinFET enablement – library, back-end stack, tool chain – that model is good forever.”

Even if there are thousands of parameters in the model, he believes the return on investment for ML model development is compelling, and that it will continue for as long as that process technology is offered.
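The missing-corner application Kahng mentions, running the timer at some corners and predicting endpoint slacks at others, can be sketched as an ordinary regression problem. The example below is a minimal illustration under strong assumptions: the data is synthetic, only three analyzed corners and one unseen corner are modeled, and the unseen-corner slack is assumed to be close to a linear function of the analyzed ones. It uses least squares via the normal equations, not any particular production method.

```python
import random

def solve(a, b):
    """Solve the linear system a x = b by Gauss-Jordan elimination."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        # Partial pivoting keeps the elimination numerically stable.
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                f = m[r][col] / m[col][col]
                m[r] = [x - f * y for x, y in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]

def fit_linear(xs, ys):
    """Least-squares weights, intercept first, via the normal equations."""
    rows = [[1.0] + list(x) for x in xs]
    k = len(rows[0])
    ata = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    aty = [sum(r[i] * y for r, y in zip(rows, ys)) for i in range(k)]
    return solve(ata, aty)

def predict(weights, x):
    """Predicted unseen-corner slack for one endpoint."""
    return weights[0] + sum(w * v for w, v in zip(weights[1:], x))

rng = random.Random(0)

def make_endpoint():
    """One synthetic endpoint: slacks (ns) at 3 analyzed corners and 1 unseen corner."""
    base = rng.uniform(-0.2, 0.5)
    known = [base + rng.gauss(0.0, 0.05) for _ in range(3)]
    # Assumed relationship, invented for this sketch.
    unseen = (0.5 * known[0] + 0.3 * known[1] + 0.2 * known[2]
              - 0.05 + rng.gauss(0.0, 0.005))
    return known, unseen

train = [make_endpoint() for _ in range(200)]
test = [make_endpoint() for _ in range(50)]

weights = fit_linear([x for x, _ in train], [y for _, y in train])
mae = sum(abs(predict(weights, x) - y) for x, y in test) / len(test)
print(f"mean absolute slack error on held-out endpoints: {mae:.4f} ns")
```

On real designs the relationship between corners is unlikely to be this clean, so the held-out error is the number to watch before trusting predictions in place of an actual timing run.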

Gathering the data, however, runs into a small-data problem, Kahng added, noting that designs, design enablements, and design targets are always changing and “different.” “I’m sure people can be creative about generating data to make big data when nominally there would be only small data, but they’ll be watching the benefits and payoffs pretty carefully.”

Connected to this, there is not yet an infrastructure for sharing data between companies, which is needed to overcome the small-data problem.

“A small data problem perhaps requires a data assimilation process, where the proprietary information is kept, and then companies are willing to freely share their information,” said Ku. “Today we have Facebook and Google, and they collect information about all of us, and we are okay with that because it’s not really from us. They don’t know I did something. Perhaps some sort of data abstract is necessary to solve that small data problem where all companies can share their data.”

Another place where there is an opportunity to collect a lot of data is in the foundry. “The ideal thing is you are getting anonymized data from manufacturing along with anonymized data from the tool flow, and then you really can build these big, correlated models,” said Rowen. “The industry would benefit, but we may have a ‘tragedy of the commons’ situation where nobody wants to participate because they are afraid.”

At the end of the day, Kahng noted, for ML to be applied ‘around’ EDA in the short term, and then go the rest of the way to ML ‘in’ EDA, the thinking around data collection must fundamentally change.

“How can small companies start to leverage what big companies have collected? That’s an industry-wide infrastructure — including anonymization and normalization and obfuscation standards — that is being alluded to. Leveraging FlexLM and Splunk and stuffing all the reports and logs into a data warehouse — this was the vision of METRICS in 1999, long before its time. Back then, Dr. Stefanus Mantik used Oracle 8i, and he wrote his own wrappers for Synopsys and Cadence SP&R tools. Today, data pipes and analytics are so well commoditized that we should see customers at a tipping point pretty soon: they’ll bite that bullet, collect the design flow data, and then I think it’s open season,” he added.
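What anonymization and normalization of design-flow records might look like before they reach a shared warehouse can be sketched in a few lines. This is a hypothetical illustration only: the record fields (`company`, `design`, `tool_step`, `runtime_hours`) are invented, no industry standard for this exists in the text above, and keyed hashing by itself is not a complete anonymization scheme, since remaining fields can still identify a design.

```python
import hashlib
import hmac

def pseudonym(value, secret_key):
    """Keyed hash so the same name always maps to the same opaque token."""
    digest = hmac.new(secret_key, value.encode("utf-8"), hashlib.sha256)
    return digest.hexdigest()[:12]

def anonymize(records, secret_key, id_fields=("company", "design"),
              metric="runtime_hours"):
    """Replace identifying fields with tokens and scale the metric to 0..1.

    Assumes every record has the listed fields and a positive metric value.
    """
    max_metric = max(r[metric] for r in records)
    shared = []
    for r in records:
        out = {f: pseudonym(r[f], secret_key) for f in id_fields}
        out["tool_step"] = r["tool_step"]
        # Normalizing hides absolute runtimes but keeps relative comparisons.
        out[metric + "_norm"] = r[metric] / max_metric
        shared.append(out)
    return shared

# Hypothetical log records, invented for this sketch.
records = [
    {"company": "AcmeChips", "design": "soc_a", "tool_step": "route", "runtime_hours": 12.0},
    {"company": "AcmeChips", "design": "soc_b", "tool_step": "route", "runtime_hours": 30.0},
    {"company": "AcmeChips", "design": "soc_a", "tool_step": "place", "runtime_hours": 6.0},
]
shared = anonymize(records, secret_key=b"kept-private-by-the-contributor")
for row in shared:
    print(row)
```

Because the key stays with the contributing company, records from the same design still group together in the shared data, while outsiders cannot map tokens back to names by hashing guesses.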

One thing is for sure today—there is no shortcut. “I started using these techniques almost two years ago, and no one believed me in the beginning,” said Ku. “I’m in charge of the internal tool development, so I have a captive audience — they have to use my tools, and they didn’t believe me, but somehow I made it happen and they started to believe me. Then I went out to the EDA vendors and told them I had this way to solve problems and make the tools better. Guess what happened? No one believed me again. One day, maybe a year ago, I showed ANSYS the internal tool that my team developed for internal use and realized that the only time people started to believe is when I showed them. That applies to internal customers, it applies to the EDA world, it will apply to the rest of the world — that’s the way human nature works.”
