# EE 445 Computer Architecture I

Fall 2025

# Administrative Details

#### Instructors

- Section 1: Ece Güran Schmidt
  - Email: <u>eguran@metu.edu.tr</u>
  - Web page: <a href="http://users.metu.edu.tr/eguran/">http://users.metu.edu.tr/eguran/</a>
  - Office: A402
- Section 2: Ahmed Hareedy
  - Email: <u>ahareedy@metu.edu.tr</u>
  - Web page: <a href="http://users.metu.edu.tr/ahareedy/">http://users.metu.edu.tr/ahareedy/</a>
  - Office: D124-2
- You can check out our web pages to see what we do in our research.

#### Schedule

- Section 1: Tuesday 14:40-15:30, Thursday 10:40-12:30, @ A209
- Section 2: Tuesday 15:40-17:30, Thursday 16:40-17:30, @ A206

#### Course Conduct

In-class lectures and problem-solving sessions. Video lectures from previous years (subject to change) and lecture notes are posted on ODTUClass.

#### Required

- Prerequisite(s): EE348
- Core course for Computer Option

# Grading

- 4 Short Exams: 52% (13% each)
- Final exam: 36%
- HDL Homework: 12%
- 5% bonus for attendance >= 80%

# Schedule

- Week 1: Intro, EE348 review
- Week 1-3: ASM-RTL
- Week 4: HDL
- Week 5-7: Basic Computer
- Week 8-9: Microprogramming
- Week 10-11: Arithmetic Processor
- Week 12-14: ARM ISA

## Administrative Details

#### Follow

- <u>https://odtuclass.metu.edu.tr/</u> for lecture slides, all class material, recorded lecture videos, and announcements
- Your <u>e123456@metu.edu.tr</u> email

#### Communication

- Preferred means of communication: e-mail
- Send with a subject that includes EE445 (no guarantee of reply otherwise)

Copyright notice: The lecture note slides are compiled from the teaching material of these books, previous lecture notes of EE445, and additional resources. Some of the slides were created entirely by the instructors.

## **Text Books**

- Digital Design (4th Edition), M. Morris Mano, Michael D. Ciletti, Prentice Hall, 2006
- Computer System Architecture, 3rd Ed., M. Morris Mano, Prentice Hall, 1992
- Computer System Architecture, 2nd Ed., M.
Morris Mano, Prentice Hall, 1982
- Harris & Harris, "Digital Design and Computer Architecture: ARM Edition", 1st Ed., Morgan Kaufmann, 2015. We continue with this textbook in EE446.

# Course Objective (Why should you take this course?)

- A smooth extension of EE348
- Describes how a computer works, at the EE348 level of detail, on a simple fictitious Basic Computer
- Preparation for the advanced topics covered in EE446: pipelining, memory organization, and I/O organization

# Course Outline

- Introduction to Computer Architecture
- EE348 review
- Algorithmic State Machine
- Register Transfer Language
- HDL
- Basic Computer Architecture
- Computer Organization and Microprogramming
- Arithmetic Processor Design
- ARM Instruction Set Architecture

# Introduction to Computer Architecture

#### Resources:

- http://www.csl.cornell.edu/courses/ece4750
- https://safari.ethz.ch/digitaltechnik/spring2020/doku.php
- Computer Architecture: A Quantitative Approach, Sixth Edition

Computer architecture:

- abstraction/implementation layers
- to execute information processing applications
- efficiently using available manufacturing technologies

The abstraction/implementation layers, from top to bottom:

- Application
- Algorithm
- Programming Language
- Operating System
- Instruction Set Architecture
- Microarchitecture
- Register-Transfer Level
- Gate Level
- Circuits
- Devices
- Technology

Source: https://www.csl.cornell.edu/courses/ece4750/handouts/ece4750_overview.pdf

#### Sort an array of numbers

$2,6,3,8,4,5 \rightarrow 2,3,4,5,6,8$

#### Out-of-place selection sort algorithm

1. Find minimum number in array
2. Move minimum number into output array
3.
   Repeat steps 1 and 2 until finished

#### C implementation of selection sort

```
void sort( int b[], int a[], int n ) {
  for ( int idx = 0, k = 0; k < n; k++ ) {
    int min = 100;                    // sentinel: assumes every input value is < 100
    for ( int i = 0; i < n; i++ ) {   // step 1: find the minimum remaining number
      if ( a[i] < min ) {
        min = a[i];
        idx = i;
      }
    }
    b[k] = min;                       // step 2: move it into the output array
    a[idx] = 100;                     // mark the source slot as consumed
  }
}
```

Operating System (Mac OS X, Windows, Linux): handles low-level hardware management.

- Instruction Set Architecture (ISA):
  - Structure and behavior of the computer as seen by the programmer
  - There can be many implementations of the same ISA

#### MIPS32 Instruction Set

Instructions that the machine executes:

```
blez  $a2, done
move  $a7, $zero
li    $t4, 99
move  $a4, $a1
move  $v1, $zero
li    $a3, 99
lw    $a5, 0($a4)
addiu $a4, $a4, 4
slt   $a6, $a5, $a3
movn  $v0, $v1, $a6
addiu $v1, $v1, 1
movn  $a3, $a5, $a6
```

# Instruction Set Architecture (ISA)

- Represents
  - all the information necessary to write a machine language program that will run correctly on the machine
  - the conceptual structure and functional behavior
- Abstracts away
  - the organization of the data flows and controls
  - the logic design
  - the physical implementation
- Enables implementations of varying cost and performance to run identical software
- Includes
  - Addressing modes
  - Operand specifications
  - Operation specifications
  - Control flow instructions

### Microarchitecture

Microarchitecture/Organization: The specific arrangement of registers, ALUs, finite state machines (FSMs), memories, and other logic building blocks needed to implement an ISA.
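
The separation between what an ISA specifies and how a microarchitecture realizes it can be mimicked in software. The following is only a toy sketch under stated assumptions: the "contract" (return the low 16 bits of a + b) and the names `add16_fast` and `add16_serial` are hypothetical and are not taken from the Basic Computer or from any real ISA. Both functions honor the same contract, but one finishes in a single step while the other takes sixteen full-adder steps, much as two processors can run identical software at different cost and performance.

```c
#include <stdint.h>

/* Hypothetical contract for this sketch: return the low 16 bits of a + b. */

/* Implementation 1: a single step using the host machine's adder. */
uint16_t add16_fast(uint16_t a, uint16_t b) {
    return (uint16_t)(a + b);
}

/* Implementation 2: bit-serial addition, one full-adder step per bit. */
uint16_t add16_serial(uint16_t a, uint16_t b) {
    uint16_t sum = 0;
    unsigned carry = 0;
    for (int i = 0; i < 16; i++) {
        unsigned x = (a >> i) & 1u;                 /* bit i of a */
        unsigned y = (b >> i) & 1u;                 /* bit i of b */
        sum |= (uint16_t)((x ^ y ^ carry) << i);    /* full-adder sum bit */
        carry = (x & y) | (carry & (x ^ y));        /* full-adder carry out */
    }
    return sum;                                     /* final carry is dropped */
}
```

A caller that only relies on the contract cannot tell the two apart: both return 0 for the inputs 0xFFFF and 1, because the carry out of bit 15 is discarded in either case.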
Example: AMD Opteron and the Intel Core i7 implement the 80x86 instruction set with very different pipeline and cache organizations.

The lower layers:

- Register-Transfer Level: how data flows through the system
- Gate Level: Boolean logic gates and functions
- Circuits: combining devices to do useful work
- Devices: transistors and wires
- Technology: silicon process technology

# EE445 Coverage

[Figure: the abstraction layer stack, Application through Technology]

EE445 focuses on these layers using a fictitious Basic Computer.

#### **Instruction Set**

| Symbol | Hex code (I = 0) | Hex code (I = 1) | Description |
|--------|------------------|------------------|-------------|
| AND | 0xxx | 8xxx | AND memory word to AC |
| ADD | 1xxx | 9xxx | Add memory word to AC |
| LDA | 2xxx | Axxx | Load AC from memory |
| STA | 3xxx | Bxxx | Store content of AC into memory |
| BUN | 4xxx | Cxxx | Branch unconditionally |
| BSA | 5xxx | Dxxx | Branch and save return address |
| ISZ | 6xxx | Exxx | Increment and skip if zero |
| CLA | 7800 | | Clear AC |
| CLE | 7400 | | Clear E |
| CMA | 7200 | | Complement AC |
| CME | 7100 | | Complement E |
| CIR | 7080 | | Circulate right AC and E |
| CIL | 7040 | | Circulate left AC and E |
| INC | 7020 | | Increment AC |
| SPA | 7010 | | Skip next instr. if AC is positive |
| SNA | 7008 | | Skip next instr. if AC is negative |
| SZA | 7004 | | Skip next instr. if AC is zero |
| SZE | 7002 | | Skip next instr. if E is zero |
| HLT | 7001 | | Halt computer |
| INP | F800 | | Input character to AC |
| OUT | F400 | | Output character from AC |
| SKI | F200 | | Skip on input flag |
| SKO | F100 | | Skip on output flag |
| ION | F080 | | Interrupt on |
| IOF | F040 | | Interrupt off |

[Figure: Basic Computer block diagram. A 4096 x 16 memory unit and the registers AR, DR, IR, INPR, and OUTR (with LD, INR, and CLR control inputs) are connected by a 16-bit common bus, together with the clock and the controller.]

ALU implementation, hardware algorithms for multiplication and division → Needs EE348 refresher ☺

#### Hardwired control

Control signals are circuit outputs.

#### **Microprogrammed Control**

Control signals are the control memory word contents.

# A Sneak Peek into EE446

A pipelined microarchitecture for a representative subset of the ARM ISA.

Instruction set:

- LDR Rd, [Rn, imm12]
- STR Rd, [Rn, imm12]
- ADD Rd, Rn, imm8
- B BTA

9/23/2025

#### Memory Hierarchy

Moving down the hierarchy, away from the processor:

- Cost per byte decreases
- Average access time increases
- Average data transfer rate decreases
- Total memory size increases
- Frequency of access decreases → Principle of locality
- Data contained in a lower level are a superset of the next higher level → Inclusion property

# Computer Architecture in METU EE

- EE 447 (Microprocessors): I/O device interfacing

Many computing devices

# Classes of Computers

- Personal Mobile Device (PMD)
  - e.g.
smart phones, tablet computers
  - Emphasis on energy efficiency and real-time
- Desktop Computing
  - Emphasis on price-performance
- Servers
  - Emphasis on availability, scalability, throughput
- Clusters / Warehouse Scale Computers
  - Used for "Software as a Service (SaaS)"
  - Emphasis on availability and price-performance
  - Sub-class: Supercomputers, emphasis: floating-point performance and fast internal networks
- Internet of Things/Embedded Computers
  - Emphasis: price

# Trends and Performance

- Integrated circuit technology (Moore's Law)
  - Transistor density: 35%/year
  - Die size: 10-20%/year
  - Integration overall: 40-55%/year
- DRAM capacity: 25-40%/year (slowing)
  - 8 Gb (2014), 16 Gb (2019), possibly no 32 Gb
- Flash capacity: 50-60%/year
  - 8-10X cheaper/bit than DRAM
- Magnetic disk capacity: recently slowed to 20%/year
  - Density increases may no longer be possible (TDMR is an exception), maybe increase from 7 to 9 platters
  - 8-10X cheaper/bit than Flash
  - 200-300X cheaper/bit than DRAM

# Performance

- Legacy view of computer architecture:
  - Instruction Set Architecture (ISA) design
  - i.e. decisions regarding: registers, memory addressing, addressing modes, instruction operands, available operations, control flow instructions, instruction encoding
- Now computer architecture also focuses on:
  - Specific requirements of the target machine
  - Design to maximize performance within constraints: cost, power, and availability
  - ISA, microarchitecture, hardware

# Performance Metrics

- Typical performance metrics:
  - Response time: time between the start and completion of an event
  - Throughput: total work done in a given time
- Speedup of X relative to Y
  - Execution time<sub>Y</sub> / Execution time<sub>X</sub>
- Execution time
  - Wall clock time: includes all system overheads
  - CPU time: only computation time
- Benchmarks
  - Kernels (e.g. matrix multiply)
  - Toy programs (e.g. sorting)
  - Synthetic benchmarks (e.g. Dhrystone)
  - Benchmark suites (e.g.
SPEC06fp, TPC-C)

# Introduction to Computer Architecture
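
As a worked illustration of two of the metrics listed under Performance Metrics, the sketch below computes the speedup of X relative to Y and a throughput figure. The function and parameter names are assumptions made for this example only; the formulas are the ones stated in the list (execution time of Y divided by execution time of X, and total work divided by elapsed time).

```c
/* Speedup of X relative to Y: execution time of Y / execution time of X.
 * Both times must use the same unit and exec_time_x must be non-zero. */
double speedup(double exec_time_y, double exec_time_x) {
    return exec_time_y / exec_time_x;
}

/* Throughput: total work done (e.g. number of jobs) per unit of time. */
double throughput(double work_done, double elapsed_time) {
    return work_done / elapsed_time;
}
```

For example, if machine Y needs 15 s for a program and machine X needs 10 s, `speedup(15.0, 10.0)` evaluates to 1.5, so X is 1.5 times as fast as Y. Whether the times passed in are wall clock times or CPU times changes the result, so the two should not be mixed in one comparison.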