
Andes SoC Development Solution Training Course ( For University )



Presentation Transcript


  1. Andes SoC Development Solution Training Course (For University)

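  2. Outline • Introduction to ANDES in-house developed processors • Introduction to the ADP-XC5FF76 Evaluation Board • AndeScore instruction set architecture • Introduction to operating the AndeSight integrated development environment • Principles of embedded software programming • Hello World • GPIO control principles • SUM control principles • MP3 • ADP-XC5FF76 Evaluation Board Totally Labs. • Reference design for developing a digital photo frame with AndESLive ANDES Confidential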

  3. Introduction to ANDES In-House Developed Processors

  4. Introduction • What are embedded systems? • Challenges in embedded system design. • Design methodologies.

  5. Embedded System? • An embedded system is a special-purpose computer system designed to perform one or a few dedicated functions • Often has real-time computing constraints • Includes hardware, software and mechanical parts

  6. Embedding a computer

  7. Embedded Processor • Characteristics • Low power • Closed operating environment • Cost sensitive

  8. Embedded Processor (cont.) • How to design a good embedded processor? • Understand how a digital system works • Understand the functional requirements of the applications • Know all the related design techniques, software and hardware • Select the features you want, and abandon those you don’t need • Evaluate the design against its goal: either as fast as possible, to leave headroom for future expansion, or as slow as acceptable, to keep cost down

  9. SOC (System On Chip) • Characteristics • A complete system manufactured on a single IC • Usually includes a processor, memories, peripherals and interfaces • May require mixed-mode (digital and analog) semiconductor technology • Components are typically modular and delivered in IP form

  10. SOC (System On Chip) (cont.) • Advantages • Cost • Power • Speed • Versatility through IP reuse • Disadvantages • Availability of IPs • Compatibility of IPs • Verification/Testing issues • Packaging and heat dissipation

  11. SOC (System On Chip) (cont.) • Key design issues • Process technology • Process integration (logic, memory) • Mixed mode • Communications and interfaces • System architecture and integration • OS kernel • Low power • Real-time computing • Application domain knowledge

  12. Applications • Personal digital assistant (PDA) • Printer • GPS • Cell phone • Automobile: engine, brakes, etc. • Television • Household appliances

  13. Embedded Systems Connect Your Life [Figure: devices spanning the Consumer Electronics Domain, the PC Domain and the Internet, including BT stereo headset, media center PC, BT keyboard and mouse, digital cable-ready TV, media player, cable STB, DVD+PVR, MP3 player, game system, HDTV, camera, printer, media phone, IP home stereo, 802.11 router, network storage, mobile phone, notebook PC and VoIP phone]

  14. Embedded Systems Connect Your Life (cont.)

  15. Embedded Systems Connect Your Life (cont.)

  16. Characteristics of Embedded Systems • Sophisticated functionality. • Real-time operation. • Low manufacturing cost. • Low power. • Designed to tight deadlines by small teams.

  17. Design methodologies • A procedure for designing a system. • Understanding your methodology helps you ensure you didn’t skip anything. • Compilers, software engineering tools, computer-aided design (CAD) tools, etc., can be used to:

  18. SoC Development Flow [Figure: flow from SoC definition into a SW track (target SW, compiler, assembler/linker, debugger tool chains) and a HW track (high-level modeling, logic design). Andes/partners’ solution: Andes Virtual Platform (application models, essential IP models, AndeScore), evaluation board (application IPs, essential IPs, AndeScore) and SoC. Customers’ design: your virtual SoC and customer SoC. Slide note: Add AICE™, ADP-AG101™, and ADP™-XC5 in v1.3.1]

  19. Summary • Embedded computers are all around us. • Many systems have complex embedded hardware and software. • Embedded systems pose many design challenges: design time, deadlines, power, etc. • Design methodologies help us manage the design process.

  20. Overview of Andes Technology • Andes Highlights • Founded in March 2005 • First-tier investors and partners (Government VC, MediaTek, and Faraday) • US$20M capital for financial stability • Andes’ Mission • Provide the best processor-based SoC solution • Market Opportunities • Device convergence in consumer electronics drives demand for multi-standard, multi-function products across applications • The BRICs demand high volumes of low-cost products • The Asian market is growing fast, and IC design worldwide is moving to Asia

  21. Andes’ Main Lines of Business: Andes Embedded™ • AndeStar™: Andes 16/32-bit mixable ISA • AndesCore™: CPU core family • AndESLive™: ESL integrated virtual environment • AndeShape™: SoC + EVB + ICE • AndeSight™: integrated development environment • AndeSoft™: optimized target SW such as Linux/RTOS, middleware, and application software

  22. AndesCore™ Market Segments • High-end N12 series: MID/Netbook, MFP, networking, gateway/router, home entertainment, smartphone/mobile phone • Mid-range N10 series: portable audio/media player, DVB/DMB baseband, DVD, DSC, toys, games • Low-end N9 series: MCU, storage, automotive control, toys

  23. AndesCore™ – Configurable Options • Instruction extensions: • Audio extensions • Performance extensions • Floating-point coprocessor • String processing acceleration • User-defined extensions • Debugging support: • Embedded Debug Module with HW breakpoints • Embedded Program Tracer • Embedded performance monitor • Core: • Big/little endian • Static/Dynamic branch prediction • BTB size: 32/64/128/256 entries • 2/3 nested interrupt levels • 16/32 GPRs • 2R1W/3R2W register file • Cache: • Instruction queue size: 2/4/8 • 8KB ~ 64KB, 1/2/4 ways • 16B/32B cache line size • Replacement policy: Pseudo LRU or random • Local Memory: • Internal or external, 4KB ~ 1MB • Memory Management • Simplest 2/4 partitions • MPU with 8 segments • MMU • microTLB size: 4/8 entries • mainTLB size: 32/64/128 entries • Page table walking: hardware or software • Bus interfaces: • AHB/AHB-Lite/APB/AMI • HSMP bus

  24. N903: Low-power Cost-efficient Embedded Controller • Features: • Harvard architecture, 5-stage pipeline • 16 general-purpose registers • Static branch prediction • Fast MAC • Hardware divider • Fully clock gated pipeline • 2-level nested interrupt • External instruction/data local memory interface • Instruction/data cache • APB/AHB/AHB-Lite/AMI bus interface • Power management instructions • 45K ~ 110K gate count • 250MHz @ 130nm • Applications: • MCU • Storage • Automotive control • Toys [Block diagram: N9 uCore with JTAG/EDM, instruction and data caches, instruction and data LM/IF, and an external bus interface (APB/AHB/AHB-Lite/AMI)]

  25. N903 Competition *TSMC free library with max speed synthesis constraint

  26. N1033A: Low-power Cost-efficient Application Processor • Features: • Harvard architecture, 5-stage pipeline • 32 general-purpose registers • Dynamic branch prediction • Fast MAC • Hardware divider • Audio acceleration instructions • Fully clock gated pipeline • 3-level nested interrupt • Instruction/Data local memory • Instruction/Data cache • DMA support for 1-D and 2-D transfer • AHB/AHB-Lite/APB bus • MMU/MPU • Power management instructions • Applications: • Portable audio/media player • DVB/DMB baseband • DVD • DSC • Toys, Games

  27. N1033A Competition *TSMC free library with max speed synthesis constraint

  28. N1213 – High Performance Application Processor • Features: • Harvard architecture, 8-stage pipeline • 32 general-purpose registers • Dynamic branch prediction • Multiply-add and multiply-subtract instructions • Divide instructions • Instruction/Data local memory • Instruction/Data cache • MMU • AHB or HSMP (AXI-like) bus • Power management instructions • Applications: • Portable media player • MFP • Networking • Gateway/Router • Home entertainment • Smartphone/Mobile phone [Block diagram: N12 execution core with JTAG/EDM, EPT I/F, MMU (ITLB, DTLB), instruction and data caches, instruction and data LM, DMA, and an external bus interface (AHB, HSMP)]

  29. N1213 Competition *TSMC free library with max speed synthesis constraint

  30. Pipeline Overview

  31. Computer architecture taxonomy • von Neumann architecture

  32. Computer architecture taxonomy (cont.) • Harvard architecture [Figure: CPU with PC, connected by separate address and data lines to a data memory and a program memory]

  33. 8-stage pipeline

  34. Instruction Fetch Stage • F1 – Instruction Fetch First • Instruction Tag/Data Arrays • ITLB Address Translation • Branch Target Buffer Prediction • F2 – Instruction Fetch Second • Instruction Cache Hit Detection • Cache Way Selection • Instruction Alignment [Pipeline diagram, repeated on the following stage slides; stage labels: IF1, IF2, ID, RF, AG, DA1, DA2, WB, plus EX, MAC1, MAC2]

  35. Instruction Issue Stage • I1 – Instruction Issue First / Instruction Decode • 32/16-Bit Instruction Decode • Return Address Stack prediction • I2 – Instruction Issue Second / Register File Access • Instruction Issue Logic • Register File Access

  36. Execution Stage • E1 – Instruction Execute First / Address Generation / MAC First • Data Access Address Generation • Multiply Operation (if a MAC is present) • E2 – Instruction Execute Second / Data Access First / MAC Second / ALU Execute • ALU • Branch/Jump/Return Resolution • Data Tag/Data arrays • DTLB address translation • Accumulation Operation (if a MAC is present) • E3 – Instruction Execute Third / Data Access Second • Data Cache Hit Detection • Cache Way Selection • Data Alignment

  37. Write Back Stage • E4 – Instruction Execute Fourth / Write Back • Interruption Resolution • Instruction Retire • Register File Write Back

  38. Branch Prediction Overview • Why is branch prediction required? • A deep pipeline is required for high speed • Increasing the number of stages between fetch and branch resolution increases the taken-branch penalty • Prediction allows the penalty to be avoided in the majority of cases • Why dynamic branch prediction? • Static branch prediction requires knowledge of the type of branch and the target address before a prediction can be made • This information is not available before the decode stage and this would still increase the penalty for all branches • Dynamic branch prediction is performed at the instruction fetch stage based purely on fetch addresses – no knowledge of the incoming instructions is required

  39. Branch Prediction Unit • Branch Target Buffer (BTB) • 128 entries of 2-bit saturating counters • Strongly-taken, Weakly-taken, Weakly-not-taken, Strongly-not-taken • 128 entries, 32-bit predicted PC and 26-bit address tag • Call-return and alignment flags • Return Address Stack (RAS) • Four entries • BTB and RAS updated by committing branches/jumps
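The four counter states listed on this slide can be sketched as a small state machine. This is a minimal illustration of a generic 2-bit saturating counter, not the Andes hardware; the function names, the initial state, and the example branch pattern are assumptions made for the sketch.

```python
# Generic 2-bit saturating counter (illustrative; not Andes RTL).
STRONGLY_NOT_TAKEN, WEAKLY_NOT_TAKEN, WEAKLY_TAKEN, STRONGLY_TAKEN = range(4)

def predict(counter):
    """Predict 'taken' when the counter is in one of the two taken states."""
    return counter >= WEAKLY_TAKEN

def update(counter, taken):
    """Move one step toward the actual outcome, saturating at both ends."""
    if taken:
        return min(counter + 1, STRONGLY_TAKEN)
    return max(counter - 1, STRONGLY_NOT_TAKEN)

def count_mispredictions(outcomes, counter=WEAKLY_NOT_TAKEN):
    """Run one branch's outcome history through a single counter."""
    misses = 0
    for taken in outcomes:
        if predict(counter) != taken:
            misses += 1
        counter = update(counter, taken)
    return misses

# Assumed pattern: a loop branch taken 9 times, then not taken, run twice.
loop_outcomes = ([True] * 9 + [False]) * 2
misses = count_mispredictions(loop_outcomes)
```

With this pattern the counter mispredicts only the first iteration and the two loop exits. A single not-taken outcome moves it from strongly-taken to weakly-taken, so the second run of the loop still starts with a correct prediction.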

  40. BTB Instruction Prediction • BTB predictions are based on the previous PC rather than on actual instruction decoding information, so the BTB may make the following two mistakes • Wrongly predicting a non-branch/jump instruction as a branch/jump instruction • Wrongly predicting the instruction boundary (32-bit -> 16-bit) • If either case is detected, the IFU triggers a BTB instruction misprediction in the I1 stage and restarts the program sequence from the recovered PC, which introduces a 2-cycle penalty

  41. RAS Prediction • When return instructions are present in the instruction sequence, RAS predictions are performed and the fetch sequence is changed to the predicted PC. • Because the RAS prediction is performed in the I1 stage, return instructions incur a 2-cycle penalty: the sequential fetches issued in the meantime are not used.
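A return address stack can be sketched as a small bounded stack: calls push the address to return to, and returns pop the prediction. The four-entry depth comes from the Branch Prediction Unit slide; the class name, the method names, and the overflow behaviour (dropping the oldest entry) are assumptions of this sketch, since the slides do not describe them.

```python
from collections import deque

class ReturnAddressStack:
    """Illustrative 4-entry return address stack (not the Andes design)."""

    def __init__(self, depth=4):
        # Assumed overflow policy: a full deque silently drops its oldest entry.
        self.entries = deque(maxlen=depth)

    def push(self, return_pc):
        """Record the return address when a call instruction is seen."""
        self.entries.append(return_pc)

    def predict_return(self):
        """Pop the most recent return address, or None if the stack is empty."""
        return self.entries.pop() if self.entries else None

# Five nested calls overflow the four entries, so the outermost return
# address (0x100) is lost and its return cannot be predicted.
ras = ReturnAddressStack()
for pc in (0x100, 0x200, 0x300, 0x400, 0x500):
    ras.push(pc)
predictions = [ras.predict_return() for _ in range(5)]
```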

  42. Branch Misprediction • In the N12 processor core, branch/return instructions are resolved by the ALU in the E2 stage, and the result is used by the IFU in the next (F1) stage. The misprediction penalty is therefore 5 cycles.
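To see what a 5-cycle penalty costs on average, the stall cycles added per instruction can be estimated as branch fraction × misprediction rate × penalty. The sketch below uses the 5-cycle figure from the slide; the 20% branch fraction and 10% misprediction rate are made-up inputs for illustration, not measured Andes data.

```python
MISPREDICT_PENALTY_CYCLES = 5  # from the slide: resolved in E2, refetch at F1

def stall_cycles_per_instruction(branch_fraction, mispredict_rate,
                                 penalty=MISPREDICT_PENALTY_CYCLES):
    """Average extra cycles per instruction caused by branch mispredictions."""
    return branch_fraction * mispredict_rate * penalty

# Assumed workload: 20% of instructions are branches, 10% of them mispredicted.
extra_cpi = stall_cycles_per_instruction(0.2, 0.1)
```

Under these assumed inputs the penalty adds roughly 0.1 cycles per instruction. Halving the misprediction rate halves that cost.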

  43. Cache

  44. Cache and CPU [Figure: CPU, cache controller, cache and main memory, connected by address and data lines]

  45. Cache operation • Many main memory locations are mapped onto one cache entry. • May have caches for: • instructions; • data; • data + instructions (unified).
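The many-to-one mapping can be made concrete by splitting an address into tag, index and offset. The sketch assumes a direct-mapped 8KB cache with 32-byte lines, one combination within the configurable ranges listed earlier; the function name and the example addresses are illustrative.

```python
def split_address(addr, cache_bytes=8 * 1024, line_bytes=32, ways=1):
    """Split a byte address into (tag, set index, byte offset within the line)."""
    sets = cache_bytes // (line_bytes * ways)
    offset = addr % line_bytes
    index = (addr // line_bytes) % sets
    tag = addr // (line_bytes * sets)
    return tag, index, offset

# Two addresses exactly one cache size (8KB) apart land on the same entry
# and differ only in their tag, so they compete for that entry.
first = split_address(0x1234)
second = split_address(0x3234)
```

Here both addresses map to index 145 with offset 20, and the tags (0 and 1) are what lets the cache tell them apart.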

  46. Multiple levels of cache [Figure: CPU, L1 cache, L2 cache]

  47. Replacement policy • Replacement policy: strategy for choosing which cache entry to throw out to make room for a new memory location. • Two popular strategies: • Random. • Least-recently used (LRU).
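The two strategies can be compared with a toy simulation that counts misses on an access trace. This sketch models a small fully associative cache with true LRU (hardware such as the pseudo-LRU option mentioned earlier only approximates it); the function names, the capacity and the trace are assumptions.

```python
import random
from collections import OrderedDict

def lru_misses(accesses, capacity):
    """Count misses when the least-recently used block is evicted."""
    cache = OrderedDict()
    misses = 0
    for block in accesses:
        if block in cache:
            cache.move_to_end(block)       # mark as most recently used
        else:
            misses += 1
            if len(cache) >= capacity:
                cache.popitem(last=False)  # evict the least recently used
            cache[block] = True
    return misses

def random_misses(accesses, capacity, seed=0):
    """Count misses when a randomly chosen block is evicted."""
    rng = random.Random(seed)
    cache = []
    misses = 0
    for block in accesses:
        if block not in cache:
            misses += 1
            if len(cache) >= capacity:
                cache.pop(rng.randrange(len(cache)))
            cache.append(block)
    return misses

trace = list("ABACB")  # assumed access trace over blocks A, B, C
lru_count = lru_misses(trace, capacity=2)
random_count = random_misses(trace, capacity=2)
```

On this trace LRU misses four times: the reuse of A is a hit, but C then evicts B just before B is needed again. The random count depends on the seed, so it is left unstated.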

  48. Write operations • Write-through: immediately copy write to main memory. • Write-back: write to main memory only when location is removed from cache.
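The difference between the two policies shows up in how many writes reach main memory. The sketch below counts them for a toy direct-mapped cache. The four-line size, the write-allocate behaviour, the use of block addresses and the function name are all assumptions made for illustration.

```python
def memory_writes(write_addrs, policy, num_lines=4):
    """Count writes that reach main memory for a sequence of block writes."""
    lines = {}  # line index -> (block address, dirty flag)
    writes_to_memory = 0
    for addr in write_addrs:
        index = addr % num_lines
        resident = lines.get(index)
        # Write-back only: evicting a dirty block copies it to memory.
        if resident is not None and resident[0] != addr and resident[1]:
            writes_to_memory += 1
        if policy == "write-through":
            writes_to_memory += 1          # every write is copied immediately
            lines[index] = (addr, False)
        else:
            lines[index] = (addr, True)    # defer the copy until eviction
    return writes_to_memory

writes = [0, 0, 0, 4, 1]  # assumed trace: block 4 evicts block 0
through_count = memory_writes(writes, "write-through")
back_count = memory_writes(writes, "write-back")
```

Write-through sends all five writes to memory, while write-back sends only the one caused by evicting dirty block 0. Blocks still dirty at the end of the trace are not flushed in this sketch.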

  49. Improving Cache Performance • Goal: reduce the Average Memory Access Time (AMAT) • AMAT = Hit Time + Miss Rate * Miss Penalty • Approaches • Reduce Hit Time • Reduce Miss Penalty • Reduce Miss Rate • Notes • There may be conflicting goals • Keep track of clock cycle time, area, and power consumption
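The AMAT formula is easy to evaluate directly. The numbers below (1-cycle hit, 5% miss rate, 20-cycle miss penalty) are assumed purely for illustration and do not describe any particular AndesCore configuration.

```python
def amat(hit_time, miss_rate, miss_penalty):
    """Average Memory Access Time = Hit Time + Miss Rate * Miss Penalty."""
    return hit_time + miss_rate * miss_penalty

# Assumed figures, in cycles: 1-cycle hit, 5% miss rate, 20-cycle penalty.
baseline = amat(hit_time=1, miss_rate=0.05, miss_penalty=20)
# Halving the miss rate (for example with a larger cache) at the same hit time.
improved = amat(hit_time=1, miss_rate=0.025, miss_penalty=20)
```

With these inputs AMAT falls from about 2.0 to about 1.5 cycles. If the larger cache also lengthened the hit time, part of that gain would be lost, which is the conflict the slide warns about.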

  50. Tuning Cache Parameters • Size: • Must be large enough to fit the working set (temporal locality) • If too big, then hit time degrades • Associativity • Needs to be large enough to avoid conflicts, but 4-8 way is about as good as fully associative (FA) • If too big, then hit time degrades • Block • Needs to be large to exploit spatial locality & reduce tag overhead • If too large, few blocks ⇒ higher misses & miss penalty • Configurable architecture allows designers to make the best performance/cost trade-offs
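The tag-overhead point can be checked with a little arithmetic: for a fixed cache size, larger blocks mean fewer lines and therefore fewer tags to store. The sketch assumes 32-bit addresses and power-of-two sizes, and ignores valid and dirty bits; the function name and the chosen configuration (8KB, 2-way) are illustrative picks from the configurable ranges listed earlier.

```python
def tag_storage_bits(cache_bytes, line_bytes, ways, addr_bits=32):
    """Total bits spent on tags, assuming power-of-two sizes."""
    lines = cache_bytes // line_bytes
    sets = lines // ways
    offset_bits = line_bytes.bit_length() - 1  # log2 of the line size
    index_bits = sets.bit_length() - 1         # log2 of the number of sets
    tag_bits = addr_bits - index_bits - offset_bits
    return lines * tag_bits

# Same 8KB, 2-way cache with the two configurable line sizes.
bits_16b_lines = tag_storage_bits(8 * 1024, line_bytes=16, ways=2)
bits_32b_lines = tag_storage_bits(8 * 1024, line_bytes=32, ways=2)
```

Doubling the line size from 16B to 32B halves the tag storage here (10240 bits versus 5120 bits), because the tag width stays at 20 bits while the number of lines halves.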
