
CS152: Computer Architecture and Engineering

Computer Architecture
Unit 1: Introduction


Slides developed by,, C.J. Taylor, &
at the University of Pennsylvania
with sources that included University of Wisconsin slides
by,,, and.

Today’s Agenda
Course overview and administrivia

What is computer architecture anyway?
…and the forces that drive it

Course Overview

CIS 371/501 vs CIS 240
In CIS 240 you learned how a processor works; in CIS 371/501 we will tell you how to make it work well.

CIS 371/501

CIS 240 is a hard pre-req. If you’ve taken CIS 371, don’t take this class.

CIS 240 vs. CIS 371/501
Focus on one toy ISA: LC4
Focus on functionality: “just get something that works”
Instructive, learn to crawl before you can walk
Not representative of real machines: 240 hardware is circa 1975

CIS 371/501
Less emphasis on any particular ISA during lectures
Focus on quantitative aspects: performance, cost, power, etc.
Representative of ~1980s hardware
also modern low-power processors, e.g., inside a Fitbit

Pervasive Idea: Abstraction and Layering
Abstraction: only way of dealing with complex systems
Divide world into objects, each with an…
Interface: knobs, behaviors, knobs → behaviors
Implementation: “black box” (ignorance+apathy)
Only specialists deal with implementation, rest of us with interface
Example: only mechanics know how cars work
Layering: abstraction discipline makes life even simpler
Divide objects in system into layers, layer n objects…
Implemented using interfaces of layer n – 1
(mostly) Don’t need to know interfaces of layer n – 2
Inertia: a dark side of layering
Layer interfaces become entrenched over time (“standards”)
Very difficult to change even if benefit is clear
Opacity: hard to reason about performance across layers

Feb 17, 2009: digital TV conversion date, was postponed to June 12th, even after significant government subsidy

Abstraction, Layering, and Computers
Computers are complex, built in layers
Several software layers: assembler, compiler, OS, applications
Instruction set architecture (ISA)
Several hardware layers: transistors, gates, CPU/Memory/IO
99% of users don’t know how the hardware layers are implemented
90% of users don’t know how any layer is implemented
That’s okay, world still works just fine
But sometimes it is helpful to understand what’s “under the hood”

[Figure: layer stack, from system software down to transistors]

Beyond CIS 371/501
CIS 380: Operating Systems
A closer look at system level software
CIS 371/501: Computer Organization and Design
A closer look at hardware layers
ESE 370: Circuit-Level Modeling, Design, and Optimization for Digital Systems
Diving into gate-level abstractions
ESE 532: System-on-Chip Design
HW+SW: design an application-specific hardware accelerator

[Figure labels: System software; most CIS courses; CIS 371/501]

Why Study Hardware?
Understand where computers are going
Future capabilities drive the (computing) world
Real-world impact: no computer architecture → no computers!
Understand high-level design concepts
The best system designers understand all the levels
Hardware, compiler, operating system, applications
Understand computer performance
Writing well-tuned (fast) software requires knowledge of hardware
Write better software
The best software designers also understand hardware
Understand the underlying hardware and its limitations
Design hardware
Intel, AMD, IBM, ARM, Qualcomm, Apple, Oracle, NVIDIA, Samsung, …

Some of you will actually be designing chips and systems and working with things at this level.

All of you will be writing code, and if you know how the system works you will be able to take better advantage of what it does well and avoid what it does poorly.
This understanding can have profound effects on code performance.

Penn Legacy
ENIAC: Electronic Numerical Integrator and Computer
First operational general-purpose electronic computer
Designed and built here by Eckert and Mauchly
See it in Moore 100!

First seminars on computer design
Moore School Lectures, 1946
“Theory and Techniques
for Design of Electronic
Digital Computers”

Administrivia

Course Staff
Instructor
Levine 572
Alexander Do
Aliza Gindi
Eric Giovannini
Brandon Park
Shreyas Shivakumar

Important Dates
(see Canvas)

PhD students: WPE-1 exam
starting this semester, CIS 501 is a “course work” WPE-1 course
must obtain a sufficiently high course grade
you must declare your WPE1 status with Britton by next Friday
This year only, CIS 501 also offers the classic exam-only WPE1 option

The Verilog Labs
“Build your own processor” (pipelined 16-bit CPU for LC4)
Use Verilog HDL (hardware description language)
Programming language compiles to gates/wires not insns
Implement and test on real hardware
FPGA (field-programmable gate array)
Instructive: learn by doing
Satisfying: “look, I built my own processor”

Lab 5 Demo

Lab Logistics
Xilinx Vivado hardware compiler
Run it from biglab.seas.upenn.edu
ZedBoard FPGA boards
Live in Towne lockers, details coming
Most labs have a demo component that runs on the ZedBoard
Development and simulation can be done before final testing on the board

Coursework (1 of 2)
Labs – Labs 2-5 done in groups of two
Lab 1: Verilog debugging
Lab 2: arithmetic unit
Lab 3: single-cycle LC4 & register file
Lab 4: pipelined LC4
Lab 5: pipelined + superscalar LC4
Labs are cumulative and increasingly complex
Each lab broken down into “milestone” deadlines
Roughly one per week

Coursework (2 of 2)
In-class midterm (see Canvas)
Cumulative final exam (time & date set by registrar)
Class participation
A good way to earn some extra calories
We will not use clickers
See the participation section of the policies page

Course Resources
Course web site
Everything is at http://www.cis.upenn.edu/~cis501 or on Canvas (syllabus, lectures, homework, submission, grades, etc.)
“Campuswire”: use the sign-up link on the course web site
The way to ask questions/clarifications
Can post to just me & TAs or anonymous to class
As a general rule, don’t email us directly
Sign-up required!
P+H, Computer Organization and Design, 4th or 5th edition
Reese & Thornton, Intro to Logic Synthesis using Verilog HDL
Both available free online! See course homepage for links

In many ways this is a class about debugging

Debugging Rules!

Tentative grade contributions:
Midterm: 20%
Final: 25%
Historical grade distribution: median grade is B+
No guarantee this semester will be similar, but the distribution seems reasonable

Homework and Late Days
Assignments are usually due on Mondays at 11:59pm. The deadline is enforced by Canvas.
Submit as often as you like; your last submission is what counts.
Any assignment can be submitted up to 48 hours late, for 75% credit
No need to give an excuse, just turn it in late
Assignments are cumulative – you have to get things to work!

Academic Misconduct
Cheating will not be tolerated

General rule:
Anything with your name on it must be YOUR OWN work
You MUST scrupulously credit all sources of help
Example: individual work on homework assignments

See the course policies

Penn’s Code of Conduct
http://www.vpul.upenn.edu/osl/acadint.html

What is Computer Architecture?

Computer Architecture
Computer architecture
Definition of ISA to facilitate implementation of software layers
The hardware/software interface

Computer micro-architecture
Design processor, memory, I/O to implement ISA
Efficiently implementing the interface

CIS 371/501 is mostly about processor micro-architecture
“architecture” is also a vacuous term for “the design of things”
software architect, network architecture, …

Application Specific Designs
This class is about general-purpose CPUs
Processor that can do anything, run a full OS, etc.
E.g., Intel Atom/Core/Xeon, AMD Ryzen/EPYC, ARM M/A series

In contrast to application-specific chips
Or ASICs (Application specific integrated circuits)
Also application-domain specific processors
Implement critical domain-specific functionality in hardware
Examples: video encoding, 3D graphics, machine learning
General rules
Hardware is less flexible than software
Hardware more effective (speed, power, cost) than software
Domain specific more “parallel” than general purpose
But mainstream processors are quite parallel as well

Technology Trends

“Technology”
Basic element
Solid-state transistor (i.e., electrical switch)
Building block of integrated circuits (ICs)

What’s so great about ICs? Everything
High performance, high reliability, low cost, low power
Lever of mass production

Several kinds of integrated circuit families
SRAM/logic: optimized for speed (used for processors)
DRAM: optimized for density, cost, power (used for memory)
Flash: optimized for density, cost (used for storage)
Increasing opportunities for integrating multiple technologies

Non-transistor storage and inter-connection technologies
Magnetic disks, optical storage, ethernet, fiber optics, wireless

Moore’s Law – 1965

233 transistors

Moore’s Law today

data c/o WikiChip

gray line is Moore’s Law, doubling density every ~2.5 years
TSMC 7nm was used for the A12 processor, in the iPhone XS/XR
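
A quick sketch of what the gray line implies, assuming density doubles every 2.5 years exactly (the slide says ~2.5). The function name and the 10-year span are illustrative choices, not part of the original plot.

```python
def density_growth(years, doubling_period_years=2.5):
    """Growth factor in density after `years`, assuming steady doubling."""
    return 2.0 ** (years / doubling_period_years)

# One decade at a 2.5-year doubling period is four doublings: 2**4 = 16x.
print(density_growth(10))  # 16.0
```

A slightly longer doubling period changes the answer a lot over a decade: try `density_growth(10, 3.0)`.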


Revolution I: The Microprocessor
Microprocessor revolution
One significant technology threshold was crossed in the 1970s
Enough transistors (~25K) to put a 16-bit processor on one chip
Huge performance advantages: fewer slow chip-crossings
Even bigger cost advantages: one “stamped-out” component

Microprocessors have allowed new market segments
Desktops, CD/DVD players, laptops, game consoles, set-top boxes, mobile phones, digital cameras, mp3 players, GPS, automotive

And replaced incumbents in existing segments
Microprocessor-based systems replaced supercomputers, “mainframes”, “minicomputers”, “desktops”, etc.

First Microprocessor
Intel 4004 (1971)
Application: calculators
Technology: 10,000 nm

2300 transistors

4-bit data
Single-cycle datapath

Revolution II: Implicit Parallelism
Then to extract implicit instruction-level parallelism
Hardware provides parallel resources, figures out how to use them
Software is oblivious

Initially using pipelining …
Which also enabled increased clock frequency
… caches …
Which became necessary as processor clock frequency increased
… and integrated floating-point
Then deeper pipelines and branch speculation
Then multiple instructions per cycle (superscalar)
Then dynamic scheduling (out-of-order execution)

We will talk about these things
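
A minimal sketch of why pipelining helps, under idealized assumptions: every stage takes one unit of time, one instruction enters the pipeline per unit, and there are no stalls or hazards. Real pipelines do worse. The function names and the 5-stage, 1000-instruction example are hypothetical.

```python
def unpipelined_time(n_insns, n_stages):
    # Each instruction runs through all stages before the next one starts.
    return n_insns * n_stages

def pipelined_time(n_insns, n_stages):
    # The first instruction needs n_stages units to finish; each later
    # instruction finishes one unit after its predecessor (ideal case).
    if n_insns == 0:
        return 0
    return n_stages + (n_insns - 1)

n, k = 1000, 5
speedup = unpipelined_time(n, k) / pipelined_time(n, k)
print(pipelined_time(n, k), round(speedup, 2))  # 1004 4.98
```

For long instruction streams the ideal speedup approaches the number of stages, which is one reason deeper pipelines were attractive.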

Pinnacle of Single-Core Microprocessors
Intel Pentium4 (2003)
Application: desktop/server
Technology: 90nm

55M transistors

32/64-bit data (16x)
22-stage pipelined datapath
3 instructions per cycle (superscalar)
Two levels of on-chip cache
data-parallel vector (SIMD) instructions, hyperthreading

Revolution III: Explicit Parallelism
Then to support explicit data & thread level parallelism
Hardware provides parallel resources, software specifies usage
Why? diminishing returns on instruction-level-parallelism

First using (subword) vector instructions…, Intel’s SSE
One instruction does four parallel multiplies

… and general support for multi-threaded programs
Coherent caches, hardware synchronization primitives

Then using support for multiple concurrent threads on chip
First with single-core multi-threading, now with multi-core

Graphics processing units (GPUs) are highly parallel
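
To make "one instruction does four parallel multiplies" concrete, here is a small software model of a 4-lane vector multiply. It only counts instructions; it is not how SSE is actually programmed. The names `simd_mul` and `multiply_arrays` are hypothetical, and the 4-lane width assumes four 32-bit values in a 128-bit register.

```python
LANES = 4  # assumed: four 32-bit lanes in one 128-bit vector register

def simd_mul(a, b):
    """Model one vector multiply: LANES independent products at once."""
    assert len(a) == LANES and len(b) == LANES
    return [x * y for x, y in zip(a, b)]

def multiply_arrays(xs, ys):
    """Elementwise product of two lists; returns (result, vector_insns)."""
    assert len(xs) == len(ys) and len(xs) % LANES == 0
    out, insns = [], 0
    for i in range(0, len(xs), LANES):
        out.extend(simd_mul(xs[i:i + LANES], ys[i:i + LANES]))
        insns += 1  # one vector instruction covers LANES elements
    return out, insns

result, insns = multiply_arrays([1, 2, 3, 4, 5, 6, 7, 8], [2] * 8)
print(result, insns)  # [2, 4, 6, 8, 10, 12, 14, 16] 2
```

Eight multiplies take two vector instructions here instead of eight scalar ones. Note that the software has to ask for this explicitly, which is the point of this revolution.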

CIS 501: Computer Architecture|Prof.|Introduction
Modern Multicore Processor
AMD EPYC 7H12
Application: server
Technology: 7nm

39.5B transistors
2.6 to 3.3 GHz
256-bit data (2x)
19-stage pipelined datapath
4 instructions per cycle
292MB of on-chip cache
data-parallel vector (SIMD) instructions, hyperthreading
64-core multicore

image from https://www.servethehome.com/amd-epyc-2-rome-what-we-know-will-change-the-game/

Historical Microprocessor Evolution
Feature | Intel 4004 | Intel Pentium 4 | AMD EPYC Rome
release date | 1971 | 2004 | 2019
transistor size | 10,000 nm | 90 nm | 7 nm, 14 nm
transistor count | 2,300 | 125M | 39.5B
area | 13 mm² | 112 mm² | 1008 mm²
frequency | 740 kHz | 3.8 GHz | 2.6-3.3 GHz
data width | 4-bit | 64-bit | 256-bit
pipeline stages | n/a | 31 | 19
pipeline width | n/a | 3 | 4
core count | 1 | 1 | 64
on-chip cache | n/a | 1MB | 292MB

4004: https://en.wikipedia.org/wiki/Intel_4004
Prescott: https://en.wikipedia.org/wiki/Pentium_4#Prescott, https://techreport.com/review/6213/intels-pentium-4-prescott-processor/
EPYC: https://wccftech.com/amd-2nd-gen-epyc-rome-iod-ccd-chipshots-39-billion-transistors/
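
A rough check on the table's numbers: how many times did the transistor count double between the 4004 and EPYC Rome, and how long did each doubling take on average? This is a back-of-the-envelope sketch. It uses transistor count per product rather than density, and the variable names are illustrative.

```python
import math

# Release years and transistor counts taken from the table.
t0_year, t0_count = 1971, 2_300     # Intel 4004
t1_year, t1_count = 2019, 39.5e9    # AMD EPYC Rome

doublings = math.log2(t1_count / t0_count)
years_per_doubling = (t1_year - t0_year) / doublings
print(round(doublings, 1), round(years_per_doubling, 1))  # 24.0 2.0
```

Roughly 24 doublings in 48 years, so about one every two years. Part of that growth comes from the much larger area (13 mm² to 1008 mm²), not from smaller transistors alone.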

Revolution IV: Accelerators
Combining multiple kinds of compute engines in one die
not just homogeneous collection of cores
System-on-Chip (SoC) is one common example in mobile space

Lots of stuff on the chip beyond just CPUs
Graphics Processing Units (GPUs)
throughput-oriented specialized multicore processors
good for gaming, machine learning, computer vision, …
Special-purpose logic
media codecs, radios, encryption, compression, machine learning

Excellent energy efficiency and performance
extremely complicated to program!

c/o Qualcomm

Our Zedboard SoC

Cerebras Wafer-Scale Engine
giant 8.5” square chip!
full of deep learning accelerators
18GB on-chip memory
9 PB/sec on-chip memory bandwidth
TSMC 16nm transistors

size of a mousepad
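
To get a feel for the bandwidth figure, a small calculation of how long it would take to stream the entire on-chip memory once. It assumes decimal units (1 GB = 10**9 bytes, 1 PB = 10**15 bytes) and that the full bandwidth is sustained, which is an idealization.

```python
# Figures from the slide; decimal units are an assumption.
memory_bytes = 18 * 10**9             # 18 GB of on-chip memory
bandwidth_bytes_per_s = 9 * 10**15    # 9 PB/sec on-chip bandwidth

# Time to read every byte of on-chip memory once, in microseconds.
sweep_time_us = memory_bytes * 10**6 / bandwidth_bytes_per_s
print(sweep_time_us)  # 2.0
```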

Technology Disruptions
Classic examples:
transistor
microprocessor
More recent examples:
flash-based solid-state storage
shift to accelerators
Nascent disruptive technologies:
non-volatile memory (“disks” as fast as DRAM)
Chip stacking (also called “3D die stacking”)
The end of Moore’s Law
“If something can’t go on forever, it must stop eventually”
Transistor speed/energy efficiency not improving like before

“Golden Age of Computer Architecture”
Hennessy & Patterson, 2018 Turing Laureates
the end of Dennard scaling & Moore’s Law means no more free performance
“The next decade will see a Cambrian explosion of novel computer architectures”

Parallelism
enhance system performance by doing multiple things at once
instruction-level parallelism, multicore, GPUs, accelerators
exploiting locality of reference: storage hierarchies
Try to provide the illusion of a single large, fast memory
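
One way to quantify the "single large, fast memory" illusion is the standard average-access-time formula: hit time plus miss rate times miss penalty. The sketch below uses a single cache level, and the latencies and hit rate are hypothetical numbers chosen for illustration, not measurements.

```python
def average_access_time(hit_time_ns, miss_rate, miss_penalty_ns):
    """Average access time for one cache level in front of a slower memory."""
    return hit_time_ns + miss_rate * miss_penalty_ns

# Hypothetical: 1 ns cache, 100 ns miss penalty, 95% of accesses hit.
print(average_access_time(1.0, 0.05, 100.0))  # 6.0
```

With good locality, most accesses hit in the small fast level, so the average stays much closer to the cache's latency than to the memory's.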

[Plot: Moore’s Law trend line versus Intel, TSMC, and Samsung process density, 2010 to 2022]
