Can you share the output of the following shell command? "% du -a /cf | sort -n -r | more"
Hi all,
Just to update this issue: according to Engineering, the problem was caused by a broadcast storm that broke communication between the RE and the FPC.
May I know whether the SRX5800 has a feature that can protect it from the impact of a broadcast storm?
Thanks, and I appreciate any feedback.
Hi rsuraj,
Below is the output:
test@srx1500% du -a /cf | sort -n -r | more
702368 /cf
694316 /cf/packages
694272 /cf/packages/junos-srxjcp-15.1X49-D80.4-domestic
6988 /cf/boot
6076 /cf/boot/modules
4048 /cf/boot/modules/mdimage.gz
704 /cf/boot/modules/if_em_vsrx.ko
704 /cf/boot/loader
508 /cf/etc
432 /cf/boot/modules/if_vtnet.ko
324 /cf/sbin
320 /cf/sbin/preinit
288 /cf/boot/modules/mac_runasnonroot.ko
288 /cf/boot/modules/mac_pcap.ko
152 /cf/etc/services
116 /cf/var
80 /cf/etc/spwd.db
80 /cf/etc/pwd.db
72 /cf/boot/support.4th
64 /cf/etc/db
60 /cf/boot/modules/virtio_blk.ko
56 /cf/etc/db/manifest
56 /cf/boot/modules/pci_hgcomm.ko
52 /cf/boot/modules/pvi_db.ko
48 /cf/opt
48 /cf/boot/modules/virtio_pci.ko
48 /cf/boot/modules/virtio.ko
40 /cf/boot/defaults
36 /cf/boot/defaults/loader.conf
32 /cf/boot/loader.help
28 /cf/usr
28 /cf/root
28 /cf/boot/modules/chassis.ko
20 /cf/var/db
20 /cf/boot/modules/libmbpool.ko
16 /cf/var/tmp
16 /cf/packages/certs.pem
16 /cf/etc/ttys
16 /cf/etc/db/manifest/jboot.certs
16 /cf/boot/loader.4th
16 /cf/boot/boot
12 /cf/var/crash
12 /cf/usr/share
12 /cf/usr/lib
12 /cf/packages/ecerts.pem
12 /cf/opt/lib
12 /cf/etc/security
12 /cf/etc/gettytab
12 /cf/etc/db/manifest/jboot.ecerts
12 /cf/etc/db/manifest/jboot
12 /cf/boot/kgzldr.o
8 /cf/var/transfer
8 /cf/var/tmp/BSD.var.dist
8 /cf/var/sw
8 /cf/var/log
8 /cf/var/cron
8 /cf/usr/share/pfe
8 /cf/root/.ssh
8 /cf/opt/sdk
8 /cf/opt/lib/dd
8 /cf/etc/namedb
8 /cf/etc/login.conf.aux
4 /cf/var/validate
4 /cf/var/transfer/config
4 /cf/var/tmp/install
4 /cf/var/tftpboot
4 /cf/var/sw/pkg
4 /cf/var/run
4 /cf/var/pdb
4 /cf/var/log/host
4 /cf/var/jails
4 /cf/var/jail
4 /cf/var/home
4 /cf/var/empty
4 /cf/var/db/host
4 /cf/var/db/fsad
4 /cf/var/db/config
4 /cf/var/db/certs
4 /cf/var/cron/tabs
4 /cf/var/crash/minfree
4 /cf/var/crash/corefiles
4 /cf/usr/share/pfe/firmware
4 /cf/usr/lib/render
4 /cf/usr/lib/dd
4 /cf/tmp
4 /cf/root/.ssh/known_hosts
4 /cf/root/.profile
4 /cf/root/.login
4 /cf/root/.history
4 /cf/root/.cshrc
4 /cf/packages/junos-srxjcp-15.1X49-D80.4-domestic.sig
4 /cf/packages/junos-srxjcp-15.1X49-D80.4-domestic.sha1
4 /cf/packages/junos-srxjcp-15.1X49-D80.4-domestic.esig
4 /cf/opt/tmp
4 /cf/opt/support
4 /cf/opt/sdk/syslog-modules
4 /cf/opt/sbin
4 /cf/opt/plugins
4 /cf/opt/lib/dd/filter
4 /cf/opt/etc
4 /cf/opt/bin
4 /cf/etc/profile
4 /cf/etc/passwd
4 /cf/etc/pam.conf.sys
4 /cf/etc/newsyslog.conf.sys
4 /cf/etc/namedb/rndc.key
4 /cf/etc/motd.sys
4 /cf/etc/motd
4 /cf/etc/master.passwd.sys
4 /cf/etc/login.access
4 /cf/etc/hosts.equiv
4 /cf/etc/host.conf
4 /cf/etc/group.sys
4 /cf/etc/group
4 /cf/etc/ftpusers
4 /cf/etc/fstab.chroot
4 /cf/etc/fstab
4 /cf/etc/db/pkg
4 /cf/etc/db/manifest/jboot.sig
4 /cf/etc/db/manifest/jboot.sha1
4 /cf/etc/db/manifest/jboot.esig
4 /cf/etc/csh.logout
4 /cf/etc/csh.login
4 /cf/etc/csh.cshrc
4 /cf/dev
4 /cf/boot/mbr
4 /cf/boot/loader.rc
4 /cf/boot/loader.conf
4 /cf/boot/boot0
0 /cf/tmp/extract-junos.39772.tgztoc
0 /cf/sbin/oinit
0 /cf/packages/junos
0 /cf/opt/lib/dd/filter/libschema-filter-dd.tlv
0 /cf/kernel.old
0 /cf/kernel
0 /cf/etc/ssh
0 /cf/etc/rc.verify
0 /cf/etc/namedb/resolver.cache
0 /cf/etc/master.passwd
0 /cf/etc/db/pkg/junos
0 /cf/etc/db/manifest/junos.sig
0 /cf/etc/db/manifest/junos.esig
0 /cf/etc/db/manifest/junos.ecerts
0 /cf/etc/db/manifest/junos.certs
0 /cf/etc/db/manifest/junos
Can you confirm that the Junos image is copied to /var/tmp/ only?
Hi rsuraj,
Yes. The first time, I copied it to /var/tmp/. Then I deleted it and tried installing directly over FTP, but I still get the same error about insufficient space.
I did an upgrade on my SRX and purposely left the old image on the disk, just in case. Now things look like they are working okay, and I was going to snapshot over the old image. Before I do, there are a couple of files on the old partition that I need to copy off, so I figured I could just reboot, interrupt the boot, and tell it to boot from the backup partition. But it keeps booting into the updated partition and not the old one.
root@GreatGazoo> show system snapshot media internal
Information for snapshot on internal (/dev/da0s1a) (backup)
Creation date: Jan 7 06:36:38 2014
JUNOS version on snapshot:
  junos : 11.4R6.6-domestic
Information for snapshot on internal (/dev/da0s2a) (primary)
Creation date: Sep 22 14:04:54 2017
JUNOS version on snapshot:
  junos : 12.1X46-D65.4-domestic
Interrupting the boot and changing the boot.current didn't seem to help either:
loader> show
LINES=24
autoboot_delay=2
autoload=n
baudrate=9600
boot.btsq.len=0x00002000
boot.btsq.start=0x003fa000
boot.current=backup
boot.devlist=nand-flash:usb
boot.env.size=0x00002000
boot.env.start=0x003fe000
boot.status=0x2000a
boot.upgrade.loader=0xbfe00000
boot.upgrade.loader.data=0x00200000
boot.upgrade.loader.hdr=0x002fffc0
boot.upgrade.uboot=0xbfc00000
boot.upgrade.uboot.data=0x00000100
boot.upgrade.uboot.hdr=0x00000030
boot.ver=1.7
bootcmd=cp.b 0xbfe00000 0x100000 0x100000; bootelf 0x100000
bootdelay=1
bootfile=/kernel;/kernel.old
comconsole_speed=9600
console=comconsole
currdev=disk0s2
So basically I want to boot to /dev/da0s1a.
Do I need to change the currdev to disk0s1? I've not seen any reference to changing this parameter and figured before I do and brick my device, I'd ask.
Thanks!!!
Hi,
Running 2 x SRX1500 that are currently directly connected via the HA Control Port and 2 x FAB ports (Fibre on ge-0/0/12 and ge-0/0/13 and ge-7/0/12 and ge-7/0/13).
Running the command "show chassis cluster status" shows us exactly what we expect. The config is shown below:
set chassis cluster reth-count 7
set chassis cluster redundancy-group 0 node 0 priority 100
set chassis cluster redundancy-group 0 node 1 priority 1
set chassis cluster redundancy-group 1 node 0 priority 100
set chassis cluster redundancy-group 1 node 1 priority 1
set interfaces ge-0/0/14 gigether-options redundant-parent reth0
set interfaces ge-0/0/15 gigether-options redundant-parent reth1
set interfaces ge-7/0/14 gigether-options redundant-parent reth0
set interfaces ge-7/0/15 gigether-options redundant-parent reth1
set interfaces fab0 fabric-options member-interfaces ge-0/0/12
set interfaces fab0 fabric-options member-interfaces ge-0/0/13
set interfaces fab1 fabric-options member-interfaces ge-7/0/12
set interfaces fab1 fabric-options member-interfaces ge-7/0/13
set interfaces reth0 redundant-ether-options redundancy-group 1
set interfaces reth0 unit 0 family inet address 192.168.30.1/24
set interfaces reth0 unit 0 family iso
set interfaces reth1 redundant-ether-options redundancy-group 1
set interfaces reth1 unit 0 family inet address 192.168.20.1/24
set interfaces reth1 unit 0 family iso
set groups node0 system host-name THW-SRX-01
set groups node0 system backup-router 192.168.5.3
set groups node0 system backup-router destination 192.168.5.0/24
set groups node0 interfaces fxp0 unit 0 family inet address 192.168.5.1/24
set groups node1 system host-name HEX-SRX-02
set groups node1 interfaces fxp0 unit 0 family inet address 192.168.5.2/24
set apply-groups "${node}"
Now, for the test, I unplugged the HA control port on unit 0 (primary). The two fab ports on unit 0 went to Up / Down status, and unit 0 also saw unit 1 as "lost".
On unit 1 we see the fab ports as Up / Up, but the chassis cluster status command showed unit 0 as lost and unit 1 (itself) as ineligible. Very strange, as I would have expected to see it as the primary.
Anyway, plugging the HA cable back in should have resulted in the HA light on the front going green on both chassis and "show chassis cluster status" going back to how it was, but no. The following has happened:
The red light is still showing (after 20 minutes) on unit 0 for the HA control port, while all four fab interfaces show as up. Here is the output from unit 0 (primary):
{primary:node0}
root@THW-SRX-01> show chassis cluster status
Monitor Failure codes:
CS Cold Sync monitoring FL Fabric Connection monitoring
GR GRES monitoring HW Hardware monitoring
IF Interface monitoring IP IP monitoring
LB Loopback monitoring MB Mbuf monitoring
NH Nexthop monitoring NP NPC monitoring
SP SPU monitoring SM Schedule monitoring
CF Config Sync monitoring
Cluster ID: 1
Node Priority Status Preempt Manual Monitor-failures
Redundancy group: 0 , Failover count: 1
node0 100 primary no no None
node1 1 disabled no no None
Redundancy group: 1 , Failover count: 3
node0 100 primary no no None
node1 1 disabled no no None
root@THW-SRX-01> show chassis cluster interfaces
Control link status: Up
Control interfaces:
Index Interface Monitored-Status Internal-SA
0 em0 Up Disabled
Fabric link status: Up
Fabric interfaces:
Name Child-interface Status
(Physical/Monitored)
fab0 ge-0/0/12 Up / Up
fab0 ge-0/0/13 Up / Up
fab1 ge-7/0/12 Up / Up
fab1 ge-7/0/13 Up / Up
Redundant-ethernet Information:
Name Status Redundancy-group
reth0 Down 1
reth1 Down 1
reth2 Down Not configured
reth3 Down Not configured
reth4 Down Not configured
reth5 Down Not configured
reth6 Down Not configured
Redundant-pseudo-interface Information:
Name Status Redundancy-group
lo0 Up 0
Absolutely nothing looks right. It looks like the whole thing is failing.
thanks in advance
By the way, if I reboot both devices they come up fine and working exactly as you would expect them to.
I know this is an old post but I don't care. Helped me a ton. Kudos to you.
Ryan
You have enough free disk space on /var/: over 13GB free. Can you create a directory, say one named "mytemp", then use the CLI to temporarily set that directory as your working directory and run the install from there?
> set cli directory mytemp
Also delete the backup copy of the software that is on the system.
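The steps above could look roughly like the session below. This is only a sketch under assumptions: the /var/tmp/mytemp path and the <package> placeholder are illustrative and not taken from your system, and "request system software delete-backup" may not be offered on every platform or release, so check the CLI completions first.

```
user@srx> show system storage
user@srx> request system software delete-backup
user@srx> request system storage cleanup
user@srx> start shell
% mkdir /var/tmp/mytemp
% exit
user@srx> set cli directory /var/tmp/mytemp
user@srx> request system software add <package> no-copy
```

The no-copy option tells the installer not to keep an extra copy of the package files, which helps when space is tight.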
[edit system login retry-options]
- backoff-threshold (1-3): how many incorrect login attempts are allowed before the delay set by backoff-factor is applied
- backoff-factor (1-10 seconds): how long to wait before presenting the user with a login prompt again
- tries-before-disconnect: how many incorrect login attempts to allow before terminating the telnet/ssh session; the user must then connect again
- lockout-period: how long, in minutes, to lock out a user account before enabling that account again
To see the accounts that are currently locked out:
> show system login lockout
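As a rough illustration of how these options fit together, the fragment below sets all four. The values are arbitrary examples chosen for the sketch, not recommendations, and the accepted ranges may differ between releases, so confirm them with "?" on your own device.

```
set system login retry-options tries-before-disconnect 3
set system login retry-options backoff-threshold 2
set system login retry-options backoff-factor 5
set system login retry-options lockout-period 5
```

With these example values, the delay starts after the second failed attempt, the session is dropped after the third, and the account is locked for 5 minutes.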
request system reboot slice alternate media internal
Hi,
It's a control-link failure. The system monitors the control link's status by default. You can configure control-link-recovery to force an automatic reboot on the disabled node for recovery.
If the control link fails, Junos OS changes the operating state of the secondary node to ineligible for a 180-second countdown. If the fabric link also fails during the 180 seconds, Junos OS changes the secondary node to primary; otherwise, after 180 seconds the secondary node state changes to disabled.
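A minimal sketch of the recovery option mentioned above. Without it, a node that has gone to the disabled state stays there until it is rebooted by hand, which matches what you saw when rebooting brought the cluster back.

```
set chassis cluster control-link-recovery
```

To recover the currently disabled node once the control link is healthy again, reboot that node (node1 in your output) with "request system reboot". You can watch the control-link heartbeats with "show chassis cluster control-plane statistics".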
Please refer to http://www.juniper.net/documentation/en_US/junos12.1x46/topics/concept/chassis-cluster-control-link-failure-recovery-understanding.html for more details and let us know in case of any doubt.
Thanks,
Vikas
Thought I had said so in the original post, but I guess not: the command is not available.
root@GreatGazoo> request system reboot ?
Possible completions:
  <[Enter]>    Execute this command
  at           Time at which to perform the operation
  in           Number of minutes to delay before operation
  media        Boot media for next boot
  message      Message to display to all users
  |            Pipe through a command
root@GreatGazoo> request system reboot sli
  ^
syntax error.
root@GreatGazoo> request system reboot slice al
  ^
Invalid numeric value: 'al' at 'al'
root@GreatGazoo> request system reboot slice alternate
  ^
Invalid numeric value: 'alternate' at 'alternate'
root@GreatGazoo> request system reboot slice alternate
  ^
Invalid numeric value: 'alternate' at 'alternate'
root@GreatGazoo> request system reboot slice alternate
Not on the SRX. And to be effective, storm control really needs to be enabled at the access layer, to stop the storm at the source. You would do this on your EX stacks.
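For reference, a rough sketch of what that looks like on EX. The syntax depends on whether the switch runs ELS or non-ELS software; the profile name sc-profile and the interface ge-0/0/1 are made up for the example, and the default thresholds vary by platform, so treat this as a starting point only.

```
# Non-ELS EX (older releases): enable storm control on all interfaces
set ethernet-switching-options storm-control interface all

# ELS EX: define a profile, then bind it to an access interface
set forwarding-options storm-control-profiles sc-profile all
set interfaces ge-0/0/1 unit 0 family ethernet-switching storm-control sc-profile
```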
Alternatively, you can mount the backup partition as root and grab the files that way.
root@test> show system snapshot media internal
Information for snapshot on internal (/dev/da0s1a) (backup)
Creation date: Aug 27 12:13:42 2016
JUNOS version on snapshot:
  junos : 12.1X46-D55.3-domestic
Information for snapshot on internal (/dev/da0s2a) (primary)
Creation date: Sep 27 18:48:05 2016
JUNOS version on snapshot:
  junos : 12.1X46-D55.3-domestic
root@test> start shell
root@test% mount /dev/da0s1a /mnt
root@test% ls /mnt
.snap dev kernel.old root
bin etc opt sbin
boot junos packages usr
cf kernel pkg var
root@test% umount /mnt
root@test%
I hope I do a decent job explaining this. I want to put my SRX210 between my home LAN and the internet. The internet access is controlled through my cable modem. As long as all the home LAN machines were in the same subnet as the internal interface of the modem, all was good (duh!). So I figured I would basically turn one interface of the SRX into a connection between the home LAN and the modem by NATting all the home LAN IPs to addresses in the range of the modem's internal network. I've attached a basic diagram of the setup.
So what happens is that at some point in time, the internal home-lan machines lose access to the internet. When I try to ping an outside IP address, it times out. If I then ping the modem internal address (172.20.15.1), the ping will typically time out once or twice and then work. After that, that same machine will then have internet access again.
So I figured it was a routing/proxy-ARP issue. I've combed through the configuration and removed everything except the parts necessary to make this work. I verified my proxy-arp and routing tables and all looks good. When I monitor the external interface of the SRX (I don't have any control of the ISP modem) I see a lot of ARP requests for 172.20.15.x IPs but no replies. Does that mean my proxy-arp isn't working? Do I need to maybe make that a /24 for the full 172.20.15 network?
This is the configuration as it's in the SRX now.
set interfaces ge-0/0/0 description "Trunk to Internet: SRX-210 (ge-0/0/0) to MOTOROLA SBG6580"
set interfaces ge-0/0/0 gigether-options auto-negotiation
set interfaces ge-0/0/0 unit 0 family inet filter input-list FLTR_ALLOW_ALL
set interfaces ge-0/0/0 unit 0 family inet sampling input
set interfaces ge-0/0/0 unit 0 family inet sampling output
set interfaces ge-0/0/0 unit 0 family inet address 172.20.15.254/24
set interfaces ge-0/0/1 description "SRX-210 (ge-0/0/1) to TP-LINK port 1 : Gateway for HOME_LAN"
set interfaces ge-0/0/1 gigether-options auto-negotiation
set interfaces ge-0/0/1 unit 0 family inet filter input-list FLTR_ALLOW_ALL
set interfaces ge-0/0/1 unit 0 family inet sampling input
set interfaces ge-0/0/1 unit 0 family inet sampling output
set interfaces ge-0/0/1 unit 0 family inet address 10.20.15.254/24
set routing-options static route 0.0.0.0/0 next-hop 172.20.15.1
set security log mode stream
set security nat source pool NAT_SRCE_POOL_HOME_LAN description "NAT SOURCE POOL FOR HOME-LAN to INTERNET CONNECTIONS"
set security nat source pool NAT_SRCE_POOL_HOME_LAN address 172.20.15.129/26
set security nat source pool NAT_SRCE_POOL_HOME_LAN host-address-base 10.20.15.129/32
set security nat source rule-set NAT_SRCE_HOME_LAN from zone HOME_LAN
set security nat source rule-set NAT_SRCE_HOME_LAN to zone Internet
set security nat source rule-set NAT_SRCE_HOME_LAN rule HOME-LAN_to_Internet match source-address 10.20.15.129/26
set security nat source rule-set NAT_SRCE_HOME_LAN rule HOME-LAN_to_Internet match destination-address 0.0.0.0/0
set security nat source rule-set NAT_SRCE_HOME_LAN rule HOME-LAN_to_Internet then source-nat pool NAT_SRCE_POOL_HOME_LAN
set security nat proxy-arp interface ge-0/0/0.0 address 172.20.15.129/32 to 172.20.15.191/32

set security policies from-zone Internet to-zone HOME_LAN policy policy_startup_rvpn_HOME_LAN match source-address any
set security policies from-zone Internet to-zone HOME_LAN policy policy_startup_rvpn_HOME_LAN match destination-address any
set security policies from-zone Internet to-zone HOME_LAN policy policy_startup_rvpn_HOME_LAN match application any
set security policies from-zone Internet to-zone HOME_LAN policy policy_startup_rvpn_HOME_LAN then permit tunnel ipsec-vpn startup_rvpn

set security policies from-zone HOME_LAN to-zone Internet policy HOME_LAN-to-Internet match source-address ADDR_HOME_NAT
set security policies from-zone HOME_LAN to-zone Internet policy HOME_LAN-to-Internet match source-address any-ipv4
set security policies from-zone HOME_LAN to-zone Internet policy HOME_LAN-to-Internet match destination-address any
set security policies from-zone HOME_LAN to-zone Internet policy HOME_LAN-to-Internet match application any
set security policies from-zone HOME_LAN to-zone Internet policy HOME_LAN-to-Internet then permit
set security policies from-zone HOME_LAN to-zone Internet policy HOME_LAN-to-Internet then log session-init
set security policies from-zone HOME_LAN to-zone Internet policy HOME_LAN-to-Internet then log session-close
set security policies from-zone HOME_LAN to-zone Internet policy HOME_LAN-to-Internet then count

set security policies from-zone HOME_LAN to-zone HOME_LAN policy HOME_LAN_HOME_LAN match source-address ADDR_HOME_LAN
set security policies from-zone HOME_LAN to-zone HOME_LAN policy HOME_LAN_HOME_LAN match destination-address ADDR_HOME_LAN
set security policies from-zone HOME_LAN to-zone HOME_LAN policy HOME_LAN_HOME_LAN match application any
set security policies from-zone HOME_LAN to-zone HOME_LAN policy HOME_LAN_HOME_LAN then permit

set security policies default-policy deny-all
set security policies policy-rematch

set security zones security-zone Internet description "SRX-210 (ge-0/0/0) to SBG6580 Port 2: Trunk to internet"
set security zones security-zone Internet tcp-rst
set security zones security-zone Internet screen untrust-screen
set security zones security-zone Internet interfaces ge-0/0/0.0 host-inbound-traffic system-services https
set security zones security-zone Internet interfaces ge-0/0/0.0 host-inbound-traffic system-services ike
set security zones security-zone Internet interfaces ge-0/0/0.0 host-inbound-traffic system-services ping

set security zones security-zone HOME_LAN description "SRX-210 (ge-0/0/1) to TP-Link Port 1: Trunk for HOME_LAN"
set security zones security-zone HOME_LAN interfaces ge-0/0/1.0 host-inbound-traffic system-services ping
set security zones security-zone HOME_LAN interfaces ge-0/0/1.0 host-inbound-traffic system-services dhcp
set security zones security-zone HOME_LAN interfaces ge-0/0/1.0 host-inbound-traffic system-services https
set security zones security-zone HOME_LAN interfaces ge-0/0/1.0 host-inbound-traffic system-services ssh

set firewall family inet filter FLTR_ALLOW_ALL term T-01 from source-address 0.0.0.0/0
set firewall family inet filter FLTR_ALLOW_ALL term T-01 from destination-address 0.0.0.0/0
set firewall family inet filter FLTR_ALLOW_ALL term T-01 then count CNT_ALLOW_ALL
set firewall family inet filter FLTR_ALLOW_ALL term T-01 then log
set firewall family inet filter FLTR_ALLOW_ALL term T-01 then syslog
set firewall family inet filter FLTR_ALLOW_ALL term T-01 then accept
This is repeatable. While typing this post, I lost my connection to the internet. All I had to do was get on 10.20.15.172 (the home LAN DNS server) and ping 172.20.15.1. The first two pings timed out and then it worked. Internet access was restored immediately.
If it makes a difference, the modem is a Motorola SBG6580. The gateway and primary network mode are both set to routed. My intent is that what the Motorola sees on its internal interface is a 172.20.15.x packet, as that's the NAT function of the SRX.
If this all makes sense, any suggestions? I'm not really expecting anyone to be able to troubleshoot this without hands on the keyboard, but I would like suggestions on what I can do to troubleshoot further. I've verified via traceoptions that the packet is actually getting NATted and hitting the exit interface of the SRX (ge-0/0/0). What I don't see is it coming back in, so I'm thinking it's something on the ISP modem. I can't change that, though, so I need to fix it on the inside (SRX) somehow.
Thanks for reading and offering any suggestions!!!
Thanks for the suggestion but would that allow me to get to the config and logs as well? I'll have to give that a try. Thanks!!!
Hi,
Can you try interface-based source NAT instead of pool-based NAT and observe the flow?
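A minimal sketch of that change, reusing the rule-set and rule names from your posted config. With interface-based NAT the home LAN traffic is translated to the address of the egress interface (172.20.15.254 in your config), so the modem only has to resolve one address and the proxy-arp range is no longer involved.

```
set security nat source rule-set NAT_SRCE_HOME_LAN rule HOME-LAN_to_Internet then source-nat interface
commit
```

Setting "then source-nat interface" replaces the existing "then source-nat pool" action on the same rule. The pool and proxy-arp statements can stay in place while you test.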
Also, can you capture a flow trace filtered by source and destination prefix, and attach it for both scenarios?
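Something along these lines should do for the trace. The file name flow-trace and the packet-filter names OUT and BACK are just labels picked for this sketch, and the addresses are the DNS server and modem addresses from your post; adjust them to whatever host you test from.

```
set security flow traceoptions file flow-trace
set security flow traceoptions flag basic-datapath
set security flow traceoptions packet-filter OUT source-prefix 10.20.15.172/32 destination-prefix 172.20.15.1/32
set security flow traceoptions packet-filter BACK source-prefix 172.20.15.1/32
commit
```

Reproduce the problem, then read the result with "show log flow-trace". Remove the trace afterwards with "delete security flow traceoptions" and commit, since flow tracing adds load on a small box.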