Research in the architecture of computer systems has always been a central preoccupation of the computer science and computer engineering communities. The investigation goals vary according to the target applications, the price of the final equipment, the programmability of the system, the environment in which the processors will be deployed, and many other factors.
For processors used in parallel machines for high-performance computing, as is the case in weather simulation, the focus is placed on high clock rates, parallelism, and high communication bandwidth, at the expense of power. In many embedded systems, the price of the final equipment is the governing factor during development. A small microcontroller is usually used to control data acquisition from sensors and provide data to actuators at a very low frequency. In many other embedded systems, in particular untethered systems, power and cost optimization are the central goals. In those systems, the growing need for more computation power, which conflicts with power and cost optimization, puts a lot of pressure on engineers, who must find a good balance among all the conflicting goals.
For an autonomous cart used to explore a given environment, the processing unit must be able to capture images, compress them, and send the compressed images to a base station for control. In parallel, the system must perform other actions such as obstacle detection and avoidance. In such a system, power must be optimized to allow the system to run as long as possible. On the other hand, the processor must process image frames as fast as possible so that important frames are not missed. Obstacle detection and avoidance must also be done as fast as possible to avoid a possible crash of the cart.
The multiplicity of goals has led to the development of several processing architectures, each optimized for a given goal. Those architectures can be categorized into three main groups according to their degree of flexibility: the general-purpose computing group, which is based on the Von Neumann (VN) computing paradigm; domain-specific processors, tailored for a class of applications that share a wide range of characteristics; and application-specific processors, tailored for only one application.